But peace to vain regrets! We see but darkly 

Even when we look behind us, and best things 

Are not so pure by nature that they needs 

Must keep to all, as fondly all believe, 

Their highest promise. If the mariner, 

When at reluctant distance he hath passed 

Some tempting island, could but know the ills 

That must have fallen upon him had he brought 

His bark to land upon the wished-for shore, 

Good cause would oft be his to thank the surf 

Whose white belt scared him thence, or wind that blew 

Inexorably adverse: for myself 

I grieve not; happy is the gowned youth, 

Who only misses what I missed, who falls 

No lower than I fell. 

The Prelude, or Growth of a Poet's Mind 
William Wordsworth, 1850 
Abstract 



This thesis studies questions of type inference, unification and elaboration for lan- 
guages that combine dependent type theory and functional programming. Lan- 
guages such as modern Haskell have very expressive type systems, allowing the 
programmer a great deal of freedom. These require advanced type inference and 
unification algorithms to reconstruct details that were left implicit, and suitable 
representation of the evidence delivered by such algorithms. 

The first part proposes an approach to unification and type inference, based on 
information increase in dependency-ordered contexts, and keeping careful track 
of variable scope. Two existing systems are reviewed: the Hindley-Milner type 
system, and units of measure in the style of Kennedy. Subtle issues relating to 
let-generalisation become clearer as a result. Using the same approach, an algo- 
rithm is described for Miller pattern unification in a full-spectrum dependent type 
theory, forming a foundation for the elaboration of dependently typed languages. 

The second part introduces inch, a language that extends Haskell with type- 
level data and functions, and dependent product types. Type-level numbers and 
arithmetic operations are specifically considered, as a particularly useful source 
of applications, such as the perennial example of vectors (length-indexed lists). 
The increased expressivity in the source language is matched by a suitable core 
language of evidence, into which inch programs can be translated. This language 
is based on System FC, the existing core language used by GHC, adapted to 
clarify the relationships between the type and term levels. It gives a coherent op- 
erational semantics to both levels, allowing shared data and dependent functions, 
but retaining a clear phase distinction. The contextual approach of the first part 
of the thesis is used to specify the elaboration of inch into the evidence language, 
and applications of inch based on type-level arithmetic are demonstrated. 



Part I 

Foundations of type inference 



Chapter 1 
Introduction 



This thesis explores the combination of the functional programming language 
Haskell with dependent type theory. It is addressed to the functional program- 
mer who wants a language that provides stronger static guarantees and a more 
expressive type system than modern Haskell, while maintaining the phase dis- 
tinction and useful, if not necessarily complete, type inference. I will assume the 
reader has some familiarity with Haskell or a similar functional language, but not 
necessarily a great deal of familiarity with type theory. Experience of advanced 
type system features such as generalised algebraic datatypes and higher-rank 
types would be beneficial. 

Haskell is a functional language with Hindley-Milner type inference in the 
tradition of ML. Thanks to type inference, the burden of type annotations is 
minimised, if not necessarily eliminated. 1 Moreover, the typeclass system enables 
term inference: types function as a real aid to the programmer, not just a safety 
net that prevents bad programs, as the compiler can write runtime code for the 
user. For example, Haskell's Eq typeclass can be used to compute an equality 
test for complex structured data from the equality tests on the component types. 
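
As a small illustration of term inference via typeclasses, the following sketch lets the compiler assemble an equality test for structured data from the tests on its components. The types Point and Shape and the function sameShape are hypothetical names introduced only for this example.

```haskell
-- Hypothetical example types; the Eq instances are written by the compiler.
data Point = Point Int Int
  deriving (Eq, Show)

data Shape
  = Circle Point Int   -- centre and radius
  | Polygon [Point]    -- list of vertices
  deriving (Eq, Show)

-- Equality on Shape is built from equality on Point, Int and lists.
sameShape :: Shape -> Shape -> Bool
sameShape s t = s == t
```

No equality code was written by hand here: the instance for Shape is derived from the instances for its component types.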

Dependent types allow term-level data into the static type system. This allows 
more precise invariants to be specified: for example, rather than the type of lists 
of arbitrary length, one can work with the type of vectors of a statically-known 
length. Term inference becomes easier, because the presence of terms in types 
leads to equational constraints on terms, and solving these constraints may allow 
the compiler to discover runtime-relevant values. While typeclasses allow terms 
to be discovered by evaluating logic programs, dependent types allow them to be 
discovered by solving equations in the underlying functional language. 

1 I include approaches requiring a little annotation, sometimes called 'type reconstruction', 
under the general term 'type inference'. Type inference is pure if no annotations are required. 



Haskell is a good basis for extension with dependent types because it is already 
widely used as a testbed for type system extensions. Numerous advanced features, 
that push the boundaries of type inference, have been adopted in the Glasgow 
Haskell Compiler (GHC): notably higher-rank types, which allow universal quan- 
tification in the domain of a function, and generalised algebraic datatypes, which 
allow data constructors to introduce equational constraints on types. While such 
extensions make pure type inference infeasible, this can be a price worth paying, 
particularly given the huge increase in expressivity achieved, the potential for 
term inference and the value of annotations as machine-checked documentation. 

1.0.1 Contexts, variable scope and let-generalisation 

One of the main themes of this thesis is the proper management of variable 
scope, which is crucial for correctly implementing type inference. Type inference 
algorithms create existential variables to stand for unknown type expressions, 
then solve for these variables by unification. Once higher-rank types are available, 
it is necessary to carefully manage which universally quantified variables are in 
scope for each existential variable. Even in the Hindley-Milner system, however, 
variable dependencies are key to understanding the process of let-generalisation. 
Let-generalisation is used to assign polymorphic types to definitions. In 

let f x = (x, x) in (f True, f 3) :: ((Bool, Bool), (Int, Int)) 

the term is well-typed because f is assigned the type ∀a. a → (a, a). This type 
is determined by inferring the type β → (β, β), where β is an existential variable, 
then quantifying over β. In more complex examples it is not always possible to 
quantify over all the existential variables, as they may have meaning outside the 
local scope of the let-binding. This will be examined in more detail in Chapter 2. 
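
The contrast between let-bound and λ-bound variables can be checked directly in Haskell. This is a minimal sketch; the names pairs and bad are chosen only for illustration.

```haskell
-- A let-bound definition is generalised, so f is used at two types here.
pairs :: ((Bool, Bool), (Int, Int))
pairs = let f x = (x, x) in (f True, f 3)

-- A lambda-bound variable is monomorphic. The variant below is rejected by
-- the typechecker, so it is left commented out:
--
-- bad = (\f -> (f True, f (3 :: Int))) (\x -> (x, x))
```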

All this motivates taking more care over metavariables than is traditional 
for the Hindley-Milner system. I will introduce a notion of context that tracks 
metavariable declarations and imposes a dependency-respecting order upon them. 
Considering contextualised unification and type inference problems leads to a 
precise notion of the minimal commitment necessary to solve a problem, and 
reveals the underlying structure that makes sense of the let-generalisation step. 
This structure makes it easier to deal with systems where variable dependency 
is more subtle than in Hindley-Milner, such as units of measure in the style of 
Kennedy (2010), considered in Chapter 3. Contexts can be extended to contain 
universal as well as existential variables, a 'mixed prefix' in the language of Miller 
(1992), allowing the analysis to be extended to dependent types, as in Chapter 4. 
1.0.2 Dependent types in GHC Haskell 

Simulating dependent types in Haskell is a cottage industry (McBride, 2002), 
and recent extensions to GHC allow some dependent datatypes to be defined 
reasonably neatly. The standard example of vectors of a fixed length is given by: 

data N = Zero | Suc N 

data Vec :: * -> N -> * where 
  Nil  :: Vec a Zero 
  Cons :: a -> Vec a n -> Vec a (Suc n) 

Datatype promotion (Yorgey et al., 2012) allows the datatype N to be used in the 
kind of Vec, and correspondingly the Zero and Suc data constructors appear in 
the types of Nil and Cons. Moreover, Vec a m is a generalised algebraic datatype 
or GADT (Peyton Jones et al., 2006), meaning that pattern matching on its 
constructors supplies information to the typechecker: a proof of the equation 
m ~ Zero in the Nil branch, and a proof of m ~ Suc n in the Cons branch. 

This type-level knowledge of length is useful for expressing more precise in- 
variants in types, leading to more reliable code. The tail function for vectors 

tail :: Vec a (Suc n) -> Vec a n 
tail (Cons _ xs) = xs 

statically enforces the invariant that its argument list must be non-empty, so this 
definition is total, and it is guaranteed to return a result of the right length. 

Type families (Chakravarty et al., 2005), which approximate functions on the 
type level, allow the definition of operations on type-level data. Addition for 
type-level naturals can be defined, then used in the type of vector concatenation: 

type family (m :: N) + (n :: N) :: N 
type instance Zero + n = n 
type instance Suc m + n = Suc (m + n) 

append :: Vec a m -> Vec a n -> Vec a (m + n) 
append Nil ys = ys 
append (Cons x xs) ys = Cons x (append xs ys) 

However, type families do not correspond exactly to term-level functions, because 
they are open, that is, defining equations can be added anywhere. They are not 
translated into case analysis, but are understood as rewrite rules on the syntax 
of type expressions. This gap between the term-level and type-level operational 
semantics is problematic for dependent types, where the same expression may be 
used both statically (in the typechecker) and dynamically (at runtime). 
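
The fragments of this subsection can be gathered into a single module. The sketch below assumes a reasonably recent GHC; the extension list, the use of Type from Data.Kind in place of *, the promotion ticks, the name vtail (to avoid the Prelude's tail) and the helper toList are choices made for this example rather than part of the original presentation.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeFamilies, TypeOperators #-}

import Data.Kind (Type)

data N = Zero | Suc N

-- Vectors indexed by their length; N is promoted to a kind by DataKinds.
data Vec :: Type -> N -> Type where
  Nil  :: Vec a 'Zero
  Cons :: a -> Vec a n -> Vec a ('Suc n)

-- Total: the Nil case is ruled out by the index.
vtail :: Vec a ('Suc n) -> Vec a n
vtail (Cons _ xs) = xs

-- Type-level addition as an open type family.
type family (m :: N) + (n :: N) :: N
type instance 'Zero + n = n
type instance 'Suc m + n = 'Suc (m + n)

append :: Vec a m -> Vec a n -> Vec a (m + n)
append Nil         ys = ys
append (Cons x xs) ys = Cons x (append xs ys)

-- Forget the length index, for inspection.
toList :: Vec a n -> [a]
toList Nil         = []
toList (Cons x xs) = x : toList xs
```

Each branch of append typechecks because pattern matching refines m, after which the type family equations rewrite the result index into the required form.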
1.0.3 The value of Π: going beyond GHC Haskell 

Vector concatenation relies only on (implicit) universal quantifiers and runtime 
functions. However, consider the vector version of the replicate function, which 
creates a vector of length n by repeating its second argument n times: 

replicate :: Π (n :: N) -> a -> Vec a n 
replicate Zero _ = Nil 
replicate (Suc n) x = Cons x (replicate n x) 

Here the result type Vec a n depends on n, but the operational behaviour of the 
function also makes use of n, as it is defined by pattern matching. This shows 
the need for the dependent product Π: it is a function space where the value 
is available both statically and dynamically. GHC Haskell does not currently 
support Π, but it can be encoded in some cases. Adding Π to Haskell is the main 
contribution of part II of this thesis. Chapter 5 describes the resulting language. 
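
One common encoding of Π in GHC Haskell uses a singleton type, whose runtime values mirror the type-level index. The sketch below is an illustration of that encoding only, not of the language developed in this thesis; the names SN, SZero, SSuc, vreplicate and toList are assumptions of the example.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

import Data.Kind (Type)

data N = Zero | Suc N

data Vec :: Type -> N -> Type where
  Nil  :: Vec a 'Zero
  Cons :: a -> Vec a n -> Vec a ('Suc n)

-- Singleton type: a runtime value of type SN n determines the type-level n.
data SN :: N -> Type where
  SZero :: SN 'Zero
  SSuc  :: SN n -> SN ('Suc n)

-- Encoded version of  replicate :: Π (n :: N) -> a -> Vec a n
vreplicate :: SN n -> a -> Vec a n
vreplicate SZero    _ = Nil
vreplicate (SSuc n) x = Cons x (vreplicate n x)

toList :: Vec a n -> [a]
toList Nil         = []
toList (Cons x xs) = x : toList xs
```

The cost of the encoding is duplication: N must be mirrored by SN, and the same would be needed for every other type used in this way.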

1.0.4 Type inference and term inference 

Dependent type theory offers a significant extension of the verification that can 
be performed by types: ultimately, the full power of constructive mathematics 
can be used to specify and prove properties of programs. However, this power 
comes at a cost. Inferring the most general type of the composition operator 

(g ∘ f) x = g (f x) 

is straightforward in Haskell, where it has type 

(b → c) → (a → b) → (a → c) . 

If the codomain of f may depend on the value of x, and g may depend on x and 
f x, then the type becomes more complicated. A possible type for composition is 

{A : Set} {B : A → Set} {C : (a : A) → B a → Set} 

(g : {a : A} (b : B a) → C a b) (f : (a : A) → B a) (a : A) → C a (f a) 

in Agda notation, 2 ignoring universe polymorphism. It is not reasonable to ask 
a machine to reconstruct this type from the definition. 

As I have noted, types are not simply a form of statically-checked documen- 
tation or a policing system that prohibits bad programs, important as these roles 

2 A dependent function space (Π-type) is written (x : S) → T or {x : S} → T, where x is 
bound in T. The type Set is a universe of small types, resembling the Haskell kind *. 
are. In exchange for writing more expressive types, the programmer can be 
repaid by having to write less of their program: term inference becomes feasible. 
Typeclasses accomplish this to a certain extent, but the presence of computational 
data in types means that constraints on types can determine runtime information. 
The replicate function defined in Subsection 1.0.3 makes crucial runtime use of its 
natural number argument. If it is used in a context demanding a value of type 
Vec a 42, the programmer should not need to supply that argument explicitly! 
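
In current GHC Haskell a similar effect can be approximated with a typeclass, so that the length is recovered from the expected type alone. This is only a sketch of the idea using a hypothetical class Replicable and method vpure; it is not the mechanism developed later in the thesis.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, FlexibleInstances #-}

import Data.Kind (Type)

data N = Zero | Suc N

data Vec :: Type -> N -> Type where
  Nil  :: Vec a 'Zero
  Cons :: a -> Vec a n -> Vec a ('Suc n)

-- Hypothetical class: instance resolution on n rebuilds the runtime argument.
class Replicable (n :: N) where
  vpure :: a -> Vec a n

instance Replicable 'Zero where
  vpure _ = Nil

instance Replicable n => Replicable ('Suc n) where
  vpure x = Cons x (vpure x)

toList :: Vec a n -> [a]
toList Nil         = []
toList (Cons x xs) = x : toList xs

-- The length is never written as a term; the type signature determines it.
three :: Vec Char ('Suc ('Suc ('Suc 'Zero)))
three = vpure 'x'
```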

Users of a dependently typed language, if they wish to prove properties of 
their programs, have much work to do in choosing appropriate representations of 
data structures and ways to enforce invariants. On the other hand, significant 
benefits can be gained with less work by selectively establishing invariants that 
use the type system to prevent certain errors, guaranteeing the absence of a class 
of bugs, if not the absence of bugs altogether. Perhaps the way forward lies 
in a mixed economy: a system that combines the flexibility of Haskell with the 
reliability of dependent type theory. This is the approach that I will pursue. 

1.1 Outline 

This thesis falls into two parts: the first develops foundations for describing and 
analysing type inference, and the second builds on this work to introduce the inch 
system, extending Haskell with dependent types. Reference implementations of 
the algorithms in part I and details of selected proofs are given in the appendices. 

Part I: Foundations of type inference 

In Chapter 2, I start at the very beginning with a rationalised reconstruction 
of type inference for the Hindley-Milner type system, and its constraint-solving 
algorithm, first-order unification. This introduces a method of contextualised 
problem-solving that sustains the later development. Paying careful attention 
to variable scope makes evident the underlying structure on first-order unifica- 
tion that explains let-generalisation. Furthermore, I describe how to elaborate 
Hindley-Milner terms into System F, representing term structure in the context. 

Following on from this in Chapter 3, I extend the basic Hindley-Milner system 
with Kennedy-style units of measure. This requires unification in the equational 
theory of abelian groups. I show how the contextual structure introduced in 
Chapter 2 makes let-generalisation straightforward, even in this more complex 
setting where variable occurrence does not imply dependency on that variable. 
Taking a different direction in Chapter 4, I apply the same techniques of 
contextualised problem-solving to higher-order unification, where the correct 
management of scope is crucial. I describe an algorithm for Miller pattern unification 
in a full-spectrum dependent type theory. Higher-order unification is needed for 
implementing type inference for dependently-typed programming languages, as 
constraint-solving must take place in the definitional equality of the type theory. 
Not all equations can be solved immediately, so the algorithm must represent 
constraints explicitly and make most general progress where possible. 

Part II: Haskell with dependent types 

Having constructed the foundations, I build on them in the second part to create 
inch, a language based on Haskell with Π-types and type-level data. In Chapter 5, 
I introduce the main features of the language by example and compare it to related 
work. This chapter contains a more thorough introduction to the encoding of 
dependently-typed programs in Haskell via GADTs and type families. 

To explain inch formally, I build an evidence language in Chapter 6, based 
on GHC's intermediate language System FC, but influenced by Martin-Löf Type 
Theory. The evidence language is a very explicit calculus for which typechecking is 
straightforward. I give a precise account of the phase distinction, as Π means that 
the categories of runtime and type-level data are no longer mutually exclusive. 
The operational semantics of the evidence language, with type safety proof, makes 
explicit the computational role of dependent Π-types. Also, I present a new 
approach to proving consistency of coercions (which witness type equalities). 

In Chapter 7, I describe type inference for inch via elaboration into the evi- 
dence language, using the ideas of contextualised problem-solving from the first 
part of the thesis. In particular, the elaboration algorithm clarifies the man- 
agement of implicit and explicit arguments. Elaboration relies on an underlying 
constraint solver, which I do not study in detail, though it would use similar 
techniques to the unification algorithms from Part I. 

The payoff for all this work appears in Chapter 8, where I present applications 
of inch, using dependent types to provide stronger guarantees of correctness. I 
give examples of vector functions, merge sort and red-black tree insertion and 
deletion, and show how the time complexity of such programs can be statically 
checked. Additionally, I demonstrate an approach to units of measure as a library 
based on type-level integers, in contrast to the built-in treatment in Chapter 3. 

Finally, some concluding remarks form Chapter 9. 
Chapter 2 

A rationalised reconstruction of 
Hindley-Milner type inference 

In this chapter I rebuild first-order unification and Hindley-Milner type infer- 
ence from the ground up. A key theme of this thesis is the proper understand- 
ing of scope, achieved by keeping variables (especially 'unification variables' or 
'metavariables') in contexts. Applying the variables-in-contexts approach to a 
standard type inference problem allows me to emphasise this theme, before mov- 
ing on to more advanced type systems. This chapter is based on the paper "Type 
inference in context" by Gundry, McBride, and McKinna (2010). Appendix A 
(page 197) contains a Haskell implementation of the algorithm described here. 

The Hindley-Milner type system 1 (Milner, 1978) consists of the simply-typed 
λ-calculus plus 'let-expressions' for polymorphic definitions. For example, 

let x = λy. y in x x 

is well-typed: x is given the polymorphic type ∀α. α → α, which is instantiated in 
two different ways, first at type (β → β) → (β → β) and second at type β → β. 
In contrast, λ-bound variables are monomorphic, so λx. x x is ill-typed. 
The syntax of terms and types is 

t, s ::= x | λx.t | s t | let x = s in t 
τ, υ ::= α | τ → υ 

where x and y range over term variables, and α and β range over type variables. 
For simplicity, the function arrow → is the only type constructor. 
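
This grammar transcribes directly into Haskell datatypes. The sketch below is one possible representation, with names as plain strings; the constructor names and the example value are assumptions of this illustration, and the implementation in Appendix A may differ.

```haskell
-- Names are plain strings in this sketch.
type TmName = String
type TyName = String

-- t, s ::= x | λx.t | s t | let x = s in t
data Term
  = Var TmName
  | Lam TmName Term
  | App Term Term
  | Let TmName Term Term
  deriving (Eq, Show)

-- τ, υ ::= α | τ → υ
data Type
  = TyVar TyName
  | Type :-> Type
  deriving (Eq, Show)
infixr 5 :->

-- let x = λy. y in x x
example :: Term
example = Let "x" (Lam "y" (Var "y")) (App (Var "x") (Var "x"))
```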

1 The work of Hindley (1969) was in type inference for combinatory logic, unlike Milner's 
type system with let-polymorphism, but 'Hindley-Milner' is the name that has stuck. 



To handle let-polymorphism, the context assigns each term variable a type 
scheme σ rather than a monomorphic type. A type scheme is a type wrapped in 
zero or more ∀-quantified variables, with the syntax 

σ ::= τ | ∀α. σ 

Morally, one should distinguish between the 'universally quantified' variables in 
type schemes, and 'existentially quantified' variables (known as 'metavariables', 
'unification variables' or 'holes') for which solutions are found by unification dur- 
ing type inference. However, for this chapter I can conflate the two: variables are 
always bound in type schemes, while metavariables are always free in the context. 

Milner's typing rules, as presented by Clément et al. (1986) and adapted into 
algorithmic form, appear in Figure 2.1. The context A is an unordered set of 
type scheme bindings, with Aₓ denoting 'A minus any x binding': such contexts 
do not reflect lexical scope, so shadowing requires deletion and reinsertion. 

Algorithm W is a well-known type inference algorithm for the Hindley-Milner 
system, due to Damas and Milner (1982), and based on the Unification Algo- 
rithm of Robinson (1965). Most presentations of Algorithm W have treated the 
underlying unification algorithm as a 'black box', but by considering both to- 
gether I will show that the generalisation step (used when inferring the type of a 
let-expression) becomes straightforward (Section 2.3). 

Why revisit Algorithm W? As a first step towards a larger goal: explaining 
how to elaborate high-level dependently typed programs into fully explicit calculi, 
as in Chapter 7. Just as W specialises polymorphic type schemes, elaboration 
involves inferring implicit arguments by solving constraints, but with fewer algo- 
rithmic guarantees. Pragmatically, we need to account for stepwise progress in 
problem solving from states of partial knowledge. I seek local correctness criteria 
for type inference that guarantee global correctness. 

2.0.1 The occurs check 

Testing whether a variable occurs in a term is used by both Robinson unifica- 
tion and Algorithm W. In unification, the check is usually necessary to ensure 
termination, let alone correctness: the equation α ≡ α → β has no finite 
solution because the right-hand side depends on the left, so it does not make a 
good definition for α. 2 

2 Of course, this assumes types are inductively defined: coinductive systems, which allow 
infinitary types as the solutions of such equations, are outside the scope of this thesis. 
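
The role of the occurs check when solving an equation between a variable and a type can be sketched as follows. The function names occurs and solveVar, and the use of Either for failure, are assumptions of this example rather than the algorithm developed later in the chapter.

```haskell
data Type = TyVar String | Type :-> Type
  deriving (Eq, Show)
infixr 5 :->

-- Does the variable a occur anywhere in the type?
occurs :: String -> Type -> Bool
occurs a (TyVar b) = a == b
occurs a (s :-> t) = occurs a s || occurs a t

-- Try to solve  α ≡ τ  by defining α := τ. The trivial equation α ≡ α needs
-- no definition; otherwise the occurs check rejects cyclic definitions.
solveVar :: String -> Type -> Either String (Maybe (String, Type))
solveVar a t
  | t == TyVar a = Right Nothing
  | occurs a t   = Left ("occurs check failed: " ++ a ++ " occurs in " ++ show t)
  | otherwise    = Right (Just (a, t))
```

For instance, solveVar "a" (TyVar "a" :-> TyVar "b") fails, reflecting the absence of a finite solution to α ≡ α → β.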
A ⊢ t : σ    (term t has type scheme σ under assumptions A) 

x : σ ∈ A    σ ≻ τ 
------------------- 
A ⊢ x : τ 

A ⊢ t : τ' → τ    A ⊢ t' : τ' 
------------------------------ 
A ⊢ t t' : τ 

Aₓ ∪ {x : τ'} ⊢ t : τ 
---------------------- 
A ⊢ λx.t : τ' → τ 

A ⊢ t' : τ'    σ = gen(A, τ')    Aₓ ∪ {x : σ} ⊢ t : τ 
------------------------------------------------------- 
A ⊢ let x = t' in t : τ 

σ ≻ τ if τ is a generic instance of σ (specialising σ yields τ) 

gen(A, τ) = ∀α₁ … αₙ. τ    (FV(τ) \ FV(A) = {α₁, …, αₙ}) 
gen(A, τ) = τ               (FV(τ) \ FV(A) = ∅) 

Figure 2.1: Milner's typing rules 



In Algorithm W, the occurs check is used to discover type dependencies just 
in time for generalisation. When inferring the type of let x = t' in t, the type 
of t' must first be inferred, then 'generic' type variables, those occurring in t' 
but not the enclosing bindings, must be quantified over. The idea is that type 
variables may be generalised over (and freely substituted) if they are not recording 
a necessary coincidence. For example, a typing derivation for λy. let x = y in x 
might have {y : α} ⊢ y : α for the definiens. One is certainly not free to generalise 
over α, as this would allow any type to be assigned to x! On the other hand, a 
derivation for let x = λy. y in x x could include ∅ ⊢ λy. y : α → α, and α must be 
generalised over for the whole expression to be well-typed. 
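
The traditional generalisation step, gen(A, τ) from Figure 2.1, can be sketched in Haskell as below. The representation of assumptions as an association list and of schemes as a list of quantified variables with a body are assumptions of this example.

```haskell
import Data.List (nub)

data Type = TyVar String | Type :-> Type
  deriving (Eq, Show)
infixr 5 :->

-- A type scheme: the quantified variables and the body.
data Scheme = Forall [String] Type
  deriving (Eq, Show)

-- Assumptions A: an association list from term variables to schemes.
type Assumptions = [(String, Scheme)]

fvType :: Type -> [String]
fvType (TyVar a) = [a]
fvType (s :-> t) = nub (fvType s ++ fvType t)

fvScheme :: Scheme -> [String]
fvScheme (Forall bs t) = [a | a <- fvType t, a `notElem` bs]

-- gen(A, τ): quantify over the variables free in τ but not free in A.
gen :: Assumptions -> Type -> Scheme
gen asms t = Forall [a | a <- fvType t, a `notElem` fvA] t
  where fvA = concatMap (fvScheme . snd) asms
```

With y : α among the assumptions, gen leaves α unquantified in the type of y, whereas with no assumptions the type α → α of λy. y is generalised to ∀α. α → α, matching the two examples in the text.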

In both unification and type inference, the occurs check is used to detect 
dependencies between variables. The traditional approach of leaving unification 
variables floating in space, without any structure, works for the Hindley-Milner 
system because there are no scoping conditions on candidate solutions for vari- 
ables. This will not always be the case, so it is better to expose the structure and 
manage dependencies explicitly. 

In further contrast to other presentations of unification and Hindley-Milner 
type inference, the algorithm I will describe is based on contexts carrying variable 
definitions as well as declarations. This allows the context to record the entire 
result of the algorithm. 
2.1 A framework for contextual problem solving 



Let me begin by revisiting unification for type expressions with free variables. 
In order to address the problem of solving equations, I must first explain which 
types are considered equal, raising the question of which things a given context 
admits as types, and which contexts make sense in the first place. 

A context Θ is a dependency-ordered list of unknown type metavariables, 
definitions of metavariables and given term variables: 

Θ ::= · | Θ, α : * | Θ, α := τ : * | Θ, x : σ | Θ ⨟ 

It is divided into 'localities' by the ⨟ marker, the role of which will be explained 
in Subsection 2.1.2. I write Ξ for a context suffix containing only metavariables. 

Contexts introduce named variables and ascribe properties to them, but the 
properties should first make sense. The rules in Figure 2.3 define the judgment 
Θ ⊢ ctx, which checks that a context is valid, i.e. that every variable is distinct 
and each property is well-formed for the preceding context. Definitions α := τ : * 
and term variable bindings x : σ make sense only if the type τ or scheme σ is 
well-scoped, as verified by the judgment Θ ⊢ σ : *. 

For example, the context α : *, β : *, x : α → β is valid, while x : α, α : * is not, 
because α is not in scope for x. This dependency-ordering means that entries on 
the right are harder to depend on, and correspondingly easier to generalise. 

Variables must not be duplicated in a context. In the rules, α # Θ means α is 
fresh for (does not occur in) Θ. I will usually ignore freshness issues: in practice, 
locally nameless representations (McBride and McKinna, 2004) are sufficient. 
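
A validity check in the spirit of the judgment Θ ⊢ ctx can be sketched as follows. The sketch assumes named variables held as strings, stores the context earliest entry first, and does not check freshness of ∀-bound names; Entry, Sep and the helper functions are hypothetical names for this example.

```haskell
data Type = TyVar String | Type :-> Type
  deriving (Eq, Show)
infixr 5 :->

data Scheme = Mono Type | Forall String Scheme
  deriving (Eq, Show)

-- Context entries, stored in dependency order (earliest first).
data Entry
  = Hole String          -- α : *
  | Defn String Type     -- α := τ : *
  | TmVar String Scheme  -- x : σ
  | Sep                  -- the locality marker
  deriving (Eq, Show)

type Context = [Entry]

-- Metavariables declared or defined in a context.
tyVars :: Context -> [String]
tyVars ctx = [a | Hole a <- ctx] ++ [a | Defn a _ <- ctx]

tmVars :: Context -> [String]
tmVars ctx = [x | TmVar x _ <- ctx]

-- Every variable of the type is in scope.
okType :: [String] -> Type -> Bool
okType scope (TyVar a) = a `elem` scope
okType scope (s :-> t) = okType scope s && okType scope t

okScheme :: [String] -> Scheme -> Bool
okScheme scope (Mono t)     = okType scope t
okScheme scope (Forall a s) = okScheme (a : scope) s

-- Checked left to right, so each entry sees only its predecessors.
valid :: Context -> Bool
valid = go []
  where
    go _    []       = True
    go seen (e : es) = okEntry seen e && go (seen ++ [e]) es

    okEntry seen (Hole a)    = a `notElem` tyVars seen
    okEntry seen (Defn a t)  = a `notElem` tyVars seen && okType (tyVars seen) t
    okEntry seen (TmVar x s) = x `notElem` tmVars seen && okScheme (tyVars seen) s
    okEntry _    Sep         = True
```

Under this sketch the context α : *, β : *, x : α → β is accepted, while x : α, α : * is rejected because α is not yet in scope when x is bound.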

Metavariable definitions induce a nontrivial equational theory on types, as 
given in Figure 2.3. The definitions in a context represent a substitution in 
'triangular form' (Baader and Snyder, 2001), that can be applied on demand to 
produce a type or type scheme that contains only unknown metavariables. 
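
Applying a triangular substitution on demand can be sketched as a single right-to-left pass over the context. The sketch keeps only metavariable entries and assumes the context is valid, so that each definition mentions only earlier variables; the names subst and expand are assumptions of this example.

```haskell
data Type = TyVar String | Type :-> Type
  deriving (Eq, Show)
infixr 5 :->

-- Only the metavariable entries matter here.
data Entry = Hole String | Defn String Type
  deriving (Eq, Show)

-- Replace the variable a by the type t throughout a type.
subst :: String -> Type -> Type -> Type
subst a t (TyVar b)
  | a == b    = t
  | otherwise = TyVar b
subst a t (u :-> v) = subst a t u :-> subst a t v

-- Expand all definitions. foldr applies the last entry first, so variables
-- introduced by expanding a later definition are handled by earlier entries.
expand :: [Entry] -> Type -> Type
expand ctx ty = foldr step ty ctx
  where
    step (Defn a t) u = subst a t u
    step (Hole _)   u = u
```

In the context α : *, β := α → α : *, γ := β → β : *, expanding γ yields (α → α) → (α → α), which contains only the unknown metavariable α.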

Unification is the problem of finding definitions for metavariables in order to 
make an equation hold. Type inference involves solving unification problems and 
finding a type that makes a typing judgment hold. Solutions to both problems 
should be 'most general' in that they should make the least commitment necessary 
to solve the equation or assign a type. In the following subsections, I will make this 
more precise by introducing a general notion of 'statements' that can be judged in 
contexts, and defining the permissible 'information increases' that move a context 
toward making a statement hold. 
Term variables        x, y 
Type metavariables    α, β, γ 
Contexts              Θ ::= · | Θ, α : * | Θ, α := τ : * | Θ, x : σ | Θ ⨟ 
Suffixes              Ξ ::= · | Ξ, α : * | Ξ, α := τ : * 
Types                 τ, υ ::= α | τ → υ 
Type schemes          σ ::= τ | ∀α. σ 
Terms                 t, s ::= x | λx.t | s t | let x = s in t 
Statements            J ::= ctx | σ : * | τ ≡ υ : * | t : σ | σ ≻ σ' | J ∧ J' 

Figure 2.2: Syntax 



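Θ ⊢ ctx    (Θ is a valid context) 

--------- 
· ⊢ ctx 

Θ ⊢ ctx    α # Θ 
----------------- 
Θ, α : * ⊢ ctx 

Θ ⊢ τ : *    α # Θ 
--------------------- 
Θ, α := τ : * ⊢ ctx 

Θ ⊢ σ : *    x # Θ 
------------------- 
Θ, x : σ ⊢ ctx 

Θ ⊢ ctx 
---------- 
Θ ⨟ ⊢ ctx 

Θ ⊢ σ : *    (σ is a well-formed type scheme in Θ) 

Θ ∋ α : *    Θ ⊢ ctx 
--------------------- 
Θ ⊢ α : * 

Θ ⊢ τ : *    Θ ⊢ υ : * 
----------------------- 
Θ ⊢ τ → υ : * 

Θ, α : * ⊢ σ : * 
----------------- 
Θ ⊢ ∀α. σ : * 

Θ ⊢ τ ≡ υ : *    (τ and υ are equal types in Θ) 

Θ ⊢ τ : * 
-------------- 
Θ ⊢ τ ≡ τ : * 

Θ ⊢ τ ≡ υ : * 
-------------- 
Θ ⊢ υ ≡ τ : * 

Θ ⊢ ctx    Θ ∋ α := τ : * 
-------------------------- 
Θ ⊢ α ≡ τ : * 

Θ ⊢ τ₀ ≡ τ₁ : *    Θ ⊢ τ₁ ≡ τ₂ : * 
------------------------------------ 
Θ ⊢ τ₀ ≡ τ₂ : * 

Θ ⊢ τ ≡ τ' : *    Θ ⊢ υ ≡ υ' : * 
---------------------------------- 
Θ ⊢ τ → υ ≡ τ' → υ' : * 

Figure 2.3: Rules for context validity, well-formed schemes and type equality 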
2.1.1 Modelling statements-in-context 

Having introduced contexts, now I will give a general picture of 
'statements-in-context', allowing unification and type inference to be viewed in a uniform setting. 
A statement is an assertion that can be judged in a context, with grammar 

J ::= 
    ctx          context validity 
    σ : *        well-formed type scheme 
    τ ≡ υ : *    equivalent types 
    t : σ        well-typed term 
    σ ≻ σ'       generic instantiation of type schemes 
    J ∧ J'       conjunction of statements 

The rules for valid contexts, well-formed type schemes and type equality are 
given in Figure 2.3. The rules for well-typed terms and generic instantiation of 
type schemes will be given in Section 2.3 (Figures 2.6 and 2.7). The conjunction 
statement has a single introduction rule and admissible elimination rules: 

Θ ⊢ J    Θ ⊢ J' 
---------------- 
Θ ⊢ J ∧ J' 

Θ ⊢ J ∧ J' 
----------- 
Θ ⊢ J 

Θ ⊢ J ∧ J' 
----------- 
Θ ⊢ J' 

Each statement J has a corresponding sanity condition, San J, whose truth 
is necessary for J to make sense. For example, the sanity condition for a typing 
statement is that the type is well-formed. Sanity conditions cannot be presup- 
posed when writing the rules; rather, care must be taken to ensure them. The 
sanity conditions are given by the following lemma. 

Lemma 2.1 (Sanity conditions). If Θ ⊢ J then Θ ⊢ San J, where 

San ctx         ↦  ctx 
San (σ : *)     ↦  ctx 
San (τ ≡ υ)     ↦  τ : * ∧ υ : * 
San (t : σ)     ↦  σ : * 
San (σ ≻ σ')    ↦  σ : * ∧ σ' : * 
San (J ∧ J')    ↦  San J ∧ San J' 



Proof. By structural induction on derivations. The sanity condition for the ctx 
statement is uninformative, as it merely says that Θ ⊢ ctx implies itself. □ 

Sanity conditions capture the requirements for a statement to be 'meaningful', 
before one can ask whether it is 'true' (Martin-Löf, 1996). 
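θ : Θ₀ ⊑ Θ₁    (θ is a metasubstitution from Θ₀ to Θ₁) 

------------ 
[] : · ⊑ Ξ 

θ : Θ₀ ⊑ Θ₁    Θ₁ ⊢ τ : * 
--------------------------- 
(θ, τ/α) : Θ₀, α : * ⊑ Θ₁ 

θ : Θ₀ ⊑ Θ₁    Θ₁ ⊢ τ ≡ θυ : * 
-------------------------------- 
(θ, τ/α) : Θ₀, α := υ : * ⊑ Θ₁ 

θ : Θ₀ ⊑ Θ₁ 
------------------------------ 
θ : Θ₀, x : σ ⊑ Θ₁, x : θσ, Ξ 

θ : Θ₀ ⊑ Θ₁ 
------------------ 
θ : Θ₀ ⨟ ⊑ Θ₁ ⨟ Ξ 

θ ≡ θ' : Θ₀ ⊑ Θ₁    (θ and θ' are equivalent metasubstitutions from Θ₀ to Θ₁) 

-------------------- 
[] ≡ [] : · ⊑ Θ₁ 

θ ≡ θ' : Θ₀ ⊑ Θ₁    Θ₁ ⊢ τ ≡ τ' : * 
---------------------------------------- 
(θ, τ/α) ≡ (θ', τ'/α) : Θ₀, α : * ⊑ Θ₁ 

θ ≡ θ' : Θ₀ ⊑ Θ₁    Θ₁ ⊢ τ ≡ θυ : *    Θ₁ ⊢ τ ≡ τ' : * 
--------------------------------------------------------- 
(θ, τ/α) ≡ (θ', τ'/α) : Θ₀, α := υ : * ⊑ Θ₁ 

θ ≡ θ' : Θ₀ ⊑ Θ₁ 
----------------------------------- 
θ ≡ θ' : Θ₀, x : σ ⊑ Θ₁, x : θσ, Ξ 

θ ≡ θ' : Θ₀ ⊑ Θ₁ 
----------------------- 
θ ≡ θ' : Θ₀ ⨟ ⊑ Θ₁ ⨟ Ξ 

Figure 2.4: Metasubstitutions 
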
2.1.2 An information order for contexts 

In order to describe algorithms that make incremental progress by modifying the 
context (substituting for variables or turning unknowns into definitions), I must 
specify what constitutes progress. This amounts to giving an 'information order' 
on contexts, so that increasing in the order makes a context 'more informative', 
i.e. more statements hold. 

Let Θ₀ and Θ₁ be valid contexts. An information increase or metasubstitution 
from Θ₀ to Θ₁ is a finite map θ from metavariables in Θ₀ to well-formed types 
in Θ₁, that respects the structure and dependency order of Θ₀. Figure 2.4 gives 
rules for the judgment θ : Θ₀ ⊑ Θ₁ that explains when θ is a metasubstitution. 
This can be understood by looking at the form of Θ₀ in each rule. If it is empty, 
then Θ₁ may contain metavariable declarations Ξ but no fixed structure. If the 
last entry in Θ₀ is a metavariable, then θ must give a well-formed type in Θ₁ to 
substitute for the metavariable, which should agree with the existing definition 
(if any). If the last entry is a term variable or ⨟ marker, then Θ₁ must have 
the same structure. Recall that a context suffix Ξ contains only metavariable 
declarations, not term variables or ⨟ markers, so it may always be added without 
changing the underlying structure. 

Metasubstitutions act on types and statements in the obvious way, extending 
the action on variables 

θα ↦ τ if τ/α ∈ θ 

homomorphically on syntax. The identity metasubstitution ι : Θ ⊑ Θ', where Θ' 
includes all the variables of Θ, usually just written Θ ⊑ Θ', replaces each variable 
with itself. A finite list of type-metavariable pairs, such as [τ/α], represents a 
metasubstitution that is the identity except where specified. 
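
The action of a metasubstitution on types and type schemes can be sketched as below. The association-list representation, in which unlisted variables are mapped to themselves, and the names MetaSubst, applyType and applyScheme are assumptions of this example.

```haskell
data Type = TyVar String | Type :-> Type
  deriving (Eq, Show)
infixr 5 :->

data Scheme = Mono Type | Forall String Scheme
  deriving (Eq, Show)

-- A metasubstitution as a finite map; unlisted variables map to themselves.
type MetaSubst = [(String, Type)]

-- Extend the action on variables homomorphically over types.
applyType :: MetaSubst -> Type -> Type
applyType th (TyVar a) = maybe (TyVar a) id (lookup a th)
applyType th (s :-> t) = applyType th s :-> applyType th t

-- Bound variables are not metavariables, so they are removed from θ under
-- the binder (this sketch assumes bound names are otherwise chosen fresh).
applyScheme :: MetaSubst -> Scheme -> Scheme
applyScheme th (Mono t)     = Mono (applyType th t)
applyScheme th (Forall a s) = Forall a (applyScheme (filter ((/= a) . fst) th) s)
```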

Equivalence of metasubstitutions, written θ ≡ θ' : Θ₀ ⊑ Θ₁ or simply θ ≡ θ' 
when the contexts are obvious, means that the corresponding types are equal, as 
shown in Figure 2.4. 

Stable statements 

Intuitively, substituting a type τ for a metavariable α should not be able to falsify 
any existing equations. More generally, making contexts more informative should 
preserve derivability of judgments. What is it about the design of the deduction 
system that ensures this? 

A statement $J$ is stable if it is preserved by metasubstitution, i.e., if 

$$\Theta_0 \vdash J \ \text{ and } \ \theta : \Theta_0 \sqsubseteq \Theta_1 \implies \Theta_1 \vdash \theta J.$$

That is, a simultaneous substitution on syntax extends to apply to derivations 
of stable statements: information increase is really the extension of simultaneous 
substitution from variables-and-terms to declarations-and-derivations. 

As context entries ascribe properties to variables, so statements ascribe 
properties to expressions. Each entry corresponds directly to a statement: $\alpha : \star$ and 
$x : \sigma$ are both entries and statements, while $\alpha := \tau : \star$ corresponds to $\alpha \equiv \tau : \star$. A 
context entry causes the corresponding judgment to hold, that is, the rule 

$$\frac{\Theta \ni J}{\Theta \vdash J}\ \text{LOOKUP}$$

is admissible. Compare this to the variable rule of a type theory: as variables 
embed in terms, so contextual properties of variables embed in judgments. 

There is a systematic technique to ensure the stability of statements by 
construction of the deduction system: the only rules using information from the 
context should correspond to LOOKUP, asserting that an entry in the context 
holds as a statement. It is then enough to check that recursive hypotheses occur 
in strictly positive positions, so they are stable by induction. 

Lemma 2.2 (Stability). If $\Theta_0 \vdash J$ then $J$ is stable. 

Proof. By structural induction on derivations. □ 

Stability means that information increases are closed under composition, where 
$\theta_2 \cdot \theta_1$ is defined by applying $\theta_2$ to every type in $\theta_1$. 
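
Composition can be illustrated with a small Haskell sketch. It assumes a sparse association-list representation of metasubstitutions (the names `Subst`, `applySubst` and `compose` are hypothetical). Because unlisted metavariables are treated as mapped to themselves, the sketch must also carry over the bindings of the second substitution for variables the first leaves alone; with total maps, as in the text, applying the second substitution to every type in the first would be enough.

```haskell
data Type = Meta String | Type :-> Type
  deriving (Eq, Show)
infixr 5 :->

type Subst = [(String, Type)]

applySubst :: Subst -> Type -> Type
applySubst th (Meta a)  = maybe (Meta a) id (lookup a th)
applySubst th (s :-> t) = applySubst th s :-> applySubst th t

-- compose th2 th1 behaves like "first th1, then th2".
compose :: Subst -> Subst -> Subst
compose th2 th1 =
  -- apply th2 to every type in th1 ...
  [ (a, applySubst th2 ty) | (a, ty) <- th1 ]
  -- ... and keep th2's action on variables that th1 leaves alone
  ++ [ (a, ty) | (a, ty) <- th2, a `notElem` map fst th1 ]
```

The intended law is that `applySubst (compose th2 th1) t` agrees with `applySubst th2 (applySubst th1 t)` for every type `t`.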

Lemma 2.3 (Category of contexts). Contexts form a category with information 
increases as morphisms. In particular, 

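$$\theta_1 : \Theta_0 \sqsubseteq \Theta_1 \ \text{ and } \ \theta_2 : \Theta_1 \sqsubseteq \Theta_2 \implies \theta_2 \cdot \theta_1 : \Theta_0 \sqsubseteq \Theta_2.$$

Proof. It is straightforward to verify that composition is associative and has 
identity $\iota$. To show closure under composition, proceed by induction on $\Theta_0$. 

If $\Theta_0$ is empty, then $\theta_1$ is trivial, so $\theta_2 \cdot \theta_1$ is trivial. Moreover $\Theta_1$ consists only 
of metavariable declarations, so the same applies to $\Theta_2$. 

If $\Theta_0 = \Theta_0', \alpha : \star$ then $\theta_1 = \theta_1', \tau/\alpha$ where $\theta_1' : \Theta_0' \sqsubseteq \Theta_1$ and $\Theta_1 \vdash \tau : \star$. Now 
induction gives $\theta_2 \cdot \theta_1' : \Theta_0' \sqsubseteq \Theta_2$ and $\Theta_2 \vdash \theta_2 \tau : \star$ by stability, so $\theta_2 \cdot \theta_1 : \Theta_0 \sqsubseteq \Theta_2$ 
since $\theta_2 \cdot (\theta_1', \tau/\alpha) = (\theta_2 \cdot \theta_1'), (\theta_2 \tau)/\alpha$. The case where $\Theta_0$ ends with a defined 
metavariable is similar, using stability of the equality statement. 

If $\Theta_0 = \Theta_0', x : \sigma$ then $\Theta_1 = \Theta_1', x : \theta_1 \sigma, \Xi_1$ and $\theta_1 : \Theta_0' \sqsubseteq \Theta_1'$. Similarly $\Theta_2 = 
\Theta_2', x : (\theta_2 \cdot \theta_1) \sigma, \Xi_2$ and $\theta_2 : \Theta_1' \sqsubseteq \Theta_2'$. Now induction gives $\theta_2 \cdot \theta_1 : \Theta_0' \sqsubseteq \Theta_2'$. □ 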

Preserving structure in the context: the $\fatsemi$ separator 

The unification and type inference algorithms given later will exploit the decla- 
ration order in the context, moving declarations left as little as possible. Thus 
the rightmost entries will be the 'most local'. Moving a declaration left (making 
it 'more global') reduces the choice of solutions, but increases the visibility of the 
variable, widening its scope. The ordering constraints will be particularly useful 
for implementing type inference for the let-expressions, in order to generalise over 
'local' type variables but not 'global' variables. 

A locality is a section of a context $\Theta$ that contains only metavariables, so term 
variables and the marker $\fatsemi$ separate localities. The definition of metasubstitution 
$\theta : \Theta_0 \sqsubseteq \Theta_1$ makes the localities of $\Theta_0$ and $\Theta_1$ correspond, so that declarations 
in any prefix of $\Theta_0$ can be interpreted over the corresponding prefix of $\Theta_1$. Thus 
if $\theta : \Theta_0 \fatsemi \Theta_0' \sqsubseteq \Theta$ then $\Theta = \Theta_1 \fatsemi \Theta_1'$ where $\theta|_{\Theta_0} : \Theta_0 \sqsubseteq \Theta_1$. (Here $\theta|_{\Theta_0}$ is the 
metasubstitution $\theta$ restricted to the metavariables in $\Theta_0$.) 

As a consequence, moving a metavariable 'left of a $\fatsemi$ separator', into a new 
locality, is an irrevocable commitment. For example, $\Theta \fatsemi \alpha : \star, \Theta' \sqsubseteq \Theta, \alpha : \star \fatsemi \Theta'$ 
holds but the converse direction does not. 

The $\fatsemi$ separators do not affect the statements that are provable in a context, 
however: $\Theta \fatsemi \Theta' \vdash J$ if and only if $\Theta, \Theta' \vdash J$. 

Just as with $\fatsemi$ separators, given variables in the context are preserved by 
metasubstitution, and their type schemes must be updated appropriately. It would be 
possible for the definition of $\theta : \Theta_0 \sqsubseteq \Theta_1$ to require $\Theta_1$ to assign a term variable 
$x$ all the types that $\Theta_0$ assigns it, but allow $x$ to become more polymorphic and 
acquire new types. For example, the identity 'information increase' 

$$\Theta, x : \tau \to \tau \ \sqsubseteq\ \Theta, x : \forall\alpha.\ \alpha \to \alpha$$

could be permitted. This notion certainly retains stability: every variable lookup 
can be simulated in the more general context. However, it allows term variables 
to be assigned arbitrarily generalised type schemes, which are incompatible with 
the known and intended value of those variables. As Wells (2002) points out, 
Hindley-Milner type inference is not in this respect compositional. He carefully 
distinguishes principal typings, given the right to demand more polymorphism, 
from Milner's principal type schemes and analyses how the language of types 
must be extended to express principal typings. 

2.1.3 Constraints: problems at ground mode 

I have described the information-increasing steps that a problem-solving algo- 
rithm can take, but how are problems themselves represented? Given any state- 
ment J for which the corresponding sanity conditions of Lemma 2.1 hold, it is 
reasonable to ask for the least information increase needed to make J hold. 

Formally, a constraint problem is a pair of a context $\Theta_0$ and a statement $J$, 
where $\Theta_0 \vdash \mathbf{San}\ J$. A solution to such a problem is then a context $\Theta_1$ and an 
information increase $\theta : \Theta_0 \sqsubseteq \Theta_1$ such that $\Theta_1 \vdash \theta J$. Such a solution is minimal 
if, for any other solution $\theta' : \Theta_0 \sqsubseteq \Theta'$, there exists a metasubstitution $\zeta : \Theta_1 \sqsubseteq \Theta'$ 
such that $\theta' \equiv \zeta \cdot \theta$ (say $\theta'$ factors through $\theta$ with cofactor $\zeta$). 

In this setting, a unification problem is a constraint problem where $J$ is an 
equation, that is, a pair of a context $\Theta_0$ and an equation $\tau \equiv \upsilon$, where $\Theta_0 \vdash \tau : \star$ 
and $\Theta_0 \vdash \upsilon : \star$. A solution to the problem (a unifier) is given by a context $\Theta_1$ and 
a metasubstitution $\theta : \Theta_0 \sqsubseteq \Theta_1$ such that $\Theta_1 \vdash \theta\tau \equiv \theta\upsilon : \star$. A minimal solution 
is a most general unifier. 

Information increase allows variables to become more informative either by 
definition or by substitution. The algorithms presented here exploit only the 
former, always choosing solutions of the form $\Theta_0 \sqsubseteq \Theta_1$. However, I will show the 
solutions are minimal with respect to arbitrary information increases: making 
progress by definition alone is enough to capture all possible solutions. 

Stability permits sound sequential problem solving: if $\theta_0 : \Theta_0 \sqsubseteq \Theta_1$ solves $J$ 
and $\theta_1 : \Theta_1 \sqsubseteq \Theta_2$ solves $\theta_0 J'$ then $\theta_1 \cdot \theta_0 : \Theta_0 \sqsubseteq \Theta_2$ solves $J \wedge J'$. Perhaps 
more surprisingly, composite problems acquire minimal solutions similarly. This 
allows a 'greedy' minimal commitment strategy for problem solving.³ 

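Lemma 2.4 (The Optimist's lemma). If $\theta_0 : \Theta_0 \sqsubseteq \Theta_1$ is a minimal solution of 
$J$ and $\theta_1 : \Theta_1 \sqsubseteq \Theta_2$ is a minimal solution of $\theta_0 J'$ then $\theta_1 \cdot \theta_0 : \Theta_0 \sqsubseteq \Theta_2$ is a 
minimal solution of $J \wedge J'$. 

Proof. Any solution $\zeta : \Theta_0 \sqsubseteq \Theta$ to $(\Theta_0, J \wedge J')$ must solve $(\Theta_0, J)$, and hence 
factor through $\theta_0 : \Theta_0 \sqsubseteq \Theta_1$. But its cofactor solves $(\Theta_1, \theta_0 J')$, and hence factors 
through $\theta_1 : \Theta_1 \sqsubseteq \Theta_2$. □ 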

I will use this lemma to prove that the unification algorithm delivers most 
general unifiers. It also expresses the underlying reason why type inference gives 
principal solutions, although a more general result is needed there, because state- 
ments have outputs and the second statement may depend on the first. 

This sequential approach to problem solving is not the only decomposition 
justified by stability. The account of unification by McAdam (1998) amounts to 
a concurrent, transactional decomposition of problems. One context is extended 
by multiple substitutions, which are then unified to produce a single substitution. 

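Another reassuring property of problem solving is that minimal solutions are 
well-defined up to isomorphism. A metasubstitution $\theta : \Theta \sqsubseteq \Theta'$ is an isomorphism 
if there exists $\theta^{-1} : \Theta' \sqsubseteq \Theta$ such that $\theta^{-1} \cdot \theta \equiv \iota$ and $\theta \cdot \theta^{-1} \equiv \iota$. The following 
lemma allows the contexts $\Theta_0$ and $\Theta_1$ to be replaced with the isomorphic $\Theta$ and 
$\Theta'$, while retaining minimality. 

Lemma 2.5 (Isomorphism lemma). Suppose $\Theta$, $\Theta'$, $\Theta_0$ and $\Theta_1$ are contexts, $J$ is 
a well-formed statement in $\Theta_0$ and $\zeta : \Theta \sqsubseteq \Theta_0$ and $\zeta' : \Theta_1 \sqsubseteq \Theta'$ are isomorphisms. 
If $\theta : \Theta_0 \sqsubseteq \Theta_1$ is a minimal solution of $J$ then $\zeta' \cdot \theta \cdot \zeta : \Theta \sqsubseteq \Theta'$ is a minimal 
solution of $\zeta^{-1} J$. 

Proof. Composition gives that $\zeta' \cdot \theta \cdot \zeta : \Theta \sqsubseteq \Theta'$ is a metasubstitution, and since 
$\Theta_1 \vdash \theta J$ we have $\Theta' \vdash \zeta' (\theta J)$ by stability (Lemma 2.2), so $\Theta' \vdash (\zeta' \cdot \theta \cdot \zeta)(\zeta^{-1} J)$. 
Hence $\zeta' \cdot \theta \cdot \zeta$ is a solution of $\zeta^{-1} J$. 

To see that it is minimal, suppose $\theta'' : \Theta \sqsubseteq \Theta''$ is such that $\Theta'' \vdash \theta'' (\zeta^{-1} J)$. 
Now $\theta'' \cdot \zeta^{-1}$ is a solution of $J$, so by minimality of $\theta$ there must be some $\zeta''$ such 
that $\zeta'' : \Theta_1 \sqsubseteq \Theta''$ and $\zeta'' \cdot \theta \equiv \theta'' \cdot \zeta^{-1}$. Hence $(\zeta'' \cdot \zeta'^{-1}) \cdot (\zeta' \cdot \theta \cdot \zeta) \equiv \theta''$ so the 
required cofactor is $\zeta'' \cdot \zeta'^{-1} : \Theta' \sqsubseteq \Theta''$. □ 

³ The 'optimistic optimisation' of McBride (1999). 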



2.2 Unification for the syntactic equational theory 

Having set the scene, I will now present the unification algorithm itself. The 
algorithm starts by structurally decomposing a constraint into multiple constraints 
on variables, which can be solved sequentially (by the Optimist's lemma). Each 
remaining constraint is either an equation between two variables (a flex-flex 
constraint) or between a metavariable and another type (a flex-rigid constraint). 
Either way, it is solved by moving through the context from right to left (most 
local to most global), updating the constraint or context appropriately. 

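For example, consider the context $\alpha : \star, \beta : \star, \alpha' := \beta : \star, \gamma : \star$ and problem 
$\alpha \to \beta \equiv \alpha' \to (\gamma \to \gamma)$. This equation decomposes into two constraints on 
variables, $\alpha \equiv \alpha'$ and $\beta \equiv \gamma \to \gamma$. The first is solved thus: 

$$\begin{array}{ll}
 & \alpha : \star,\ \beta : \star,\ \alpha' := \beta,\ \gamma : \star,\ [\alpha \equiv \alpha'] \\
\longrightarrow & \alpha : \star,\ \beta : \star,\ \alpha' := \beta,\ [\alpha \equiv \alpha'],\ \gamma : \star \\
\longrightarrow & \alpha : \star,\ \beta : \star,\ [\alpha \equiv \beta],\ \alpha' := \beta,\ \gamma : \star \\
\longrightarrow & \alpha : \star,\ \beta := \alpha,\ \alpha' := \beta,\ \gamma : \star
\end{array}$$

To solve $\alpha \equiv \alpha'$, the algorithm ignores $\gamma$ since it does not occur in the 
constraint, moves past $\alpha'$ by updating the constraint to $\alpha \equiv \beta$, then defines $\beta$. 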

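Solving the flex-rigid constraint $\beta \equiv \gamma \to \gamma$ requires $\gamma$ to be moved back 
through the context, since it occurs in the constraint but cannot be instantiated: 

$$\begin{array}{ll}
 & \alpha : \star,\ \beta := \alpha,\ \alpha' := \beta,\ \gamma : \star,\ [\beta \equiv \gamma \to \gamma] \\
\longrightarrow & \alpha : \star,\ \beta := \alpha,\ \alpha' := \beta,\ [\gamma : \star \mid \beta \equiv \gamma \to \gamma] \\
\longrightarrow & \alpha : \star,\ \beta := \alpha,\ [\gamma : \star \mid \beta \equiv \gamma \to \gamma],\ \alpha' := \beta \\
\longrightarrow & \alpha : \star,\ [\gamma : \star \mid \alpha \equiv \gamma \to \gamma],\ \beta := \alpha,\ \alpha' := \beta \\
\longrightarrow & \gamma : \star,\ \alpha := \gamma \to \gamma,\ \beta := \alpha,\ \alpha' := \beta
\end{array}$$

Here the algorithm ignores $\alpha'$, moves past the definition of $\beta$ by updating the 
constraint to $\alpha \equiv \gamma \to \gamma$, then defines $\alpha$ after pasting in $\gamma$. In general, when 
solving a 'flex-rigid' equation between a metavariable and a type, the algorithm 
must accumulate the type's dependencies as it finds them, performing the occurs 
check to ensure a solution exists. This is how variables move outward through 
localities, acquiring a more global relevance. 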

The unification algorithm is formally defined by the rules in Figure 2.5. Each 
inference rule can be read clockwise from the bottom-left: the inputs to the rule 
determine the inputs to the first premise, then the outputs from the first premise 
determine the inputs to the second premise, and so on, until the outputs from all 
the premises determine the outputs of the conclusion. 

The unify judgment $\Theta_0 \vdash \tau \equiv \upsilon : \star \dashv \Theta_1$ means that given inputs $\Theta_0$, $\tau$ and 
$\upsilon$, unification succeeds with solution $\Theta_0 \sqsubseteq \Theta_1$. The inputs must satisfy the sanity 
conditions $\Theta_0 \vdash \tau : \star$ and $\Theta_0 \vdash \upsilon : \star$. Symmetric variants of the INST and DEFINE 
rules have been omitted. 

The instantiate judgment $\Theta_0 \mid \Xi \vdash \alpha \equiv \tau : \star \dashv \Theta_1$ means that given inputs $\Theta_0$, 
$\Xi$, $\alpha$ and $\tau$, instantiating $\alpha$ with $\tau$ succeeds, yielding solution $\Theta_0, \Xi \sqsubseteq \Theta_1$. The idea 
is that the bar ($\mid$) represents progress in examining context elements in order, 
and $\Xi$ contains exactly those declarations on which $\tau$ depends. Formally, the 
inputs must satisfy the following conditions, where the set $\mathrm{fmv}(\tau)$ records those 
metavariables occurring free in type $\tau$. 

Definition 2.1. The quadruple $(\Theta_0, \Xi, \alpha, \tau)$ satisfies the input conditions if 

• $\Theta_0 \vdash \alpha : \star$ where $\alpha$ is a metavariable, 

• $\Theta_0, \Xi \vdash \tau : \star$ where $\tau$ is not a metavariable, and 

• $\Xi$ contains only metavariable declarations $\beta : \star$ with $\beta \in \mathrm{fmv}(\tau)$. 

The main point of these conditions is to ensure that $\Xi$ contains only genuine 
dependencies of $\tau$, so moving $\Xi$ back in the context will not sacrifice generality. 
Observe that no rule applies to deduce 

$$\Theta_0, \alpha : \star \mid \Xi \vdash \alpha \equiv \tau : \star \dashv \Theta_1 \quad \text{with } \alpha \in \mathrm{fmv}(\tau),$$

where the algorithm fails. This is an occurs check failure: $\alpha$ and $\tau$ cannot unify if 
$\alpha$ occurs in $\tau$, and $\tau$ is not a variable. Given the single type constructor symbol 
(the function arrow $\to$), there are no failures due to rigid-rigid mismatch, but 
adding these will not significantly complicate matters. 
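
The unify and instantiate judgments described above can be approximated by the following Haskell sketch. It is an illustration under stated assumptions, not the code of Appendix A.2: the names (`Entry`, `Hole`, `Defn`, `unify`, `flexFlex`, `inst`) are hypothetical, contexts are stored as lists with the most local entry first, type schemes of term variables are elided, and failure is reported with `Nothing`.

```haskell
data Type = Meta String | Type :-> Type
  deriving (Eq, Show)
infixr 5 :->

data Entry
  = Hole String        -- alpha : *
  | Defn String Type   -- alpha := tau : *
  | TmVar String       -- x : sigma (scheme elided in this sketch)
  | Sep                -- locality separator
  deriving (Eq, Show)

-- Assumption: the head of the list is the rightmost (most local) entry.
type Context = [Entry]

fmv :: Type -> [String]
fmv (Meta a)  = [a]
fmv (s :-> t) = fmv s ++ fmv t

-- subst g ty t replaces the metavariable g by ty in t.
subst :: String -> Type -> Type -> Type
subst g ty (Meta a)
  | a == g    = ty
  | otherwise = Meta a
subst g ty (s :-> t) = subst g ty s :-> subst g ty t

unify :: Context -> Type -> Type -> Maybe Context
unify cx (s0 :-> s1) (t0 :-> t1) =                -- decompose, left to right
  unify cx s0 t0 >>= \cx1 -> unify cx1 s1 t1
unify cx (Meta a) (Meta b) = flexFlex cx a b
unify cx (Meta a) ty       = inst cx [] a ty      -- flex-rigid
unify cx ty       (Meta a) = inst cx [] a ty      -- symmetric variant

flexFlex :: Context -> String -> String -> Maybe Context
flexFlex [] _ _ = Nothing                         -- variable out of scope
flexFlex (e : cx) a b = case e of
  Hole g
    | g == a && g == b -> Just (e : cx)                  -- idle
    | g == a           -> Just (Defn a (Meta b) : cx)    -- define
    | g == b           -> Just (Defn b (Meta a) : cx)    -- define (symmetric)
    | otherwise        -> (e :) <$> flexFlex cx a b      -- skip a hole
  Defn g ty -> (e :) <$> unify cx (subst g ty (Meta a)) (subst g ty (Meta b))
  _         -> (e :) <$> flexFlex cx a b                 -- skip x or separator

-- inst cx xi a ty: xi collects the dependencies of ty, in left-to-right order.
inst :: Context -> [Entry] -> String -> Type -> Maybe Context
inst [] _ _ _ = Nothing
inst (e : cx) xi a ty = case e of
  Hole b
    | b == a, a `elem` fmv ty -> Nothing                             -- occurs check
    | b == a                  -> Just (Defn a ty : reverse xi ++ cx) -- define
    | b `elem` fmv ty         -> inst cx (e : xi) a ty               -- depend
    | otherwise               -> (e :) <$> inst cx xi a ty           -- skip
  Defn b u -> (e :) <$> unify (reverse xi ++ cx) (subst b u (Meta a)) (subst b u ty)
  _        -> (e :) <$> inst cx xi a ty
```

Running `unify` on the worked example, with the context written right to left as `[Hole "c", Defn "a'" (Meta "b"), Hole "b", Hole "a"]` and the problem `Meta "a" :-> Meta "b"` against `Meta "a'" :-> (Meta "c" :-> Meta "c")`, produces a context that reads, left to right, as γ : ⋆, α := γ → γ, β := α, α' := β, matching the derivation in the text.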

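The unification algorithm is implemented in Appendix A.2 (page 200). 

$\Theta_0 \vdash \tau \equiv \upsilon : \star \dashv \Theta_1$ (unifying $\tau$ with $\upsilon$ in $\Theta_0$ results in $\Theta_1$) 

$$\frac{\Theta_0 \vdash \tau_0 \equiv \upsilon_0 : \star \dashv \Theta_1 \qquad \Theta_1 \vdash \tau_1 \equiv \upsilon_1 : \star \dashv \Theta_2}{\Theta_0 \vdash (\tau_0 \to \tau_1) \equiv (\upsilon_0 \to \upsilon_1) : \star \dashv \Theta_2}\ \text{DECOMPOSE}$$

$$\frac{\tau \text{ non-variable} \qquad \Theta_0 \mid \cdot \vdash \alpha \equiv \tau : \star \dashv \Theta_1}{\Theta_0 \vdash \alpha \equiv \tau : \star \dashv \Theta_1}\ \text{INST}$$

$$\frac{}{\Theta, \alpha : \star \vdash \alpha \equiv \alpha : \star \dashv \Theta, \alpha : \star}\ \text{IDLE} \qquad \frac{\alpha \neq \beta}{\Theta, \alpha : \star \vdash \alpha \equiv \beta : \star \dashv \Theta, \alpha := \beta : \star}\ \text{DEFINE}$$

$$\frac{\Theta_0 \vdash [\tau/\gamma]\alpha \equiv [\tau/\gamma]\beta : \star \dashv \Theta_1}{\Theta_0, \gamma := \tau : \star \vdash \alpha \equiv \beta : \star \dashv \Theta_1, \gamma := \tau : \star}\ \text{SUBS}$$

$$\frac{\Theta_0 \vdash \alpha \equiv \beta : \star \dashv \Theta_1 \qquad \alpha \neq \gamma \qquad \beta \neq \gamma}{\Theta_0, \gamma : \star \vdash \alpha \equiv \beta : \star \dashv \Theta_1, \gamma : \star}\ \text{SKIP-TY}$$

$$\frac{\Theta_0 \vdash \alpha \equiv \beta : \star \dashv \Theta_1}{\Theta_0, x : \sigma \vdash \alpha \equiv \beta : \star \dashv \Theta_1, x : \sigma}\ \text{SKIP-TM} \qquad \frac{\Theta_0 \vdash \alpha \equiv \beta : \star \dashv \Theta_1}{\Theta_0 \fatsemi \vdash \alpha \equiv \beta : \star \dashv \Theta_1 \fatsemi}\ \text{SKIP-SEMI}$$

$\Theta_0 \mid \Xi \vdash \alpha \equiv \tau : \star \dashv \Theta_1$ (instantiating $\alpha$ with $\tau$ in $\Theta_0, \Xi$ results in $\Theta_1$) 

$$\frac{\alpha \notin \mathrm{fmv}(\tau)}{\Theta_0, \alpha : \star \mid \Xi \vdash \alpha \equiv \tau : \star \dashv \Theta_0, \Xi, \alpha := \tau : \star}\ \text{INST-DEFINE}$$

$$\frac{\Theta_0, \Xi \vdash [\upsilon/\beta]\alpha \equiv [\upsilon/\beta]\tau : \star \dashv \Theta_1}{\Theta_0, \beta := \upsilon : \star \mid \Xi \vdash \alpha \equiv \tau : \star \dashv \Theta_1, \beta := \upsilon : \star}\ \text{INST-SUBS}$$

$$\frac{\Theta_0 \mid \beta : \star, \Xi \vdash \alpha \equiv \tau : \star \dashv \Theta_1 \qquad \alpha \neq \beta \qquad \beta \in \mathrm{fmv}(\tau)}{\Theta_0, \beta : \star \mid \Xi \vdash \alpha \equiv \tau : \star \dashv \Theta_1}\ \text{INST-DEPEND}$$

$$\frac{\Theta_0 \mid \Xi \vdash \alpha \equiv \tau : \star \dashv \Theta_1 \qquad \alpha \neq \beta \qquad \beta \notin \mathrm{fmv}(\tau)}{\Theta_0, \beta : \star \mid \Xi \vdash \alpha \equiv \tau : \star \dashv \Theta_1, \beta : \star}\ \text{INST-SKIP-TY}$$

$$\frac{\Theta_0 \mid \Xi \vdash \alpha \equiv \tau : \star \dashv \Theta_1}{\Theta_0, x : \sigma \mid \Xi \vdash \alpha \equiv \tau : \star \dashv \Theta_1, x : \sigma}\ \text{INST-SKIP-TM}$$

$$\frac{\Theta_0 \mid \Xi \vdash \alpha \equiv \tau : \star \dashv \Theta_1}{\Theta_0 \fatsemi \mid \Xi \vdash \alpha \equiv \tau : \star \dashv \Theta_1 \fatsemi}\ \text{INST-SKIP-SEMI}$$

Figure 2.5: Algorithmic rules for unification 

2.2.1 Correctness of syntactic unification 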

The contextual problem-solving discipline I have introduced allows soundness to 
be linked with generality, showing that unification produces minimal solutions. 

Lemma 2.6 (Soundness and generality of unification). 

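(a) If $\Theta_0 \vdash \tau \equiv \upsilon : \star \dashv \Theta_1$ then $\Theta_0 \sqsubseteq \Theta_1$ is a minimal solution of $\tau \equiv \upsilon$. 

(b) If $\Theta_0 \mid \Xi \vdash \alpha \equiv \tau : \star \dashv \Theta_1$ then $\Theta_0, \Xi \sqsubseteq \Theta_1$ is a minimal solution of $\alpha \equiv \tau$. 

Proof. By induction on the structure of derivations. The key idea is that the type 
variables of $\Theta_0$ and $\Theta_1$ are the same, and whenever $\theta : \Theta_0 \sqsubseteq \Theta'$ is a solution, 
the definitions made in $\Theta_1$ must hold as equations in $\Theta'$ for the problem to be 
solved, so $\theta$ can be rearranged to produce the necessary cofactor $\zeta : \Theta_1 \sqsubseteq \Theta'$. For 
details, see Appendix D.1 (page 236). □ 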

A lemma about the occurs check is needed for completeness of unification. 

Lemma 2.7 (Occurs check). Let $\alpha$ be a metavariable and $\tau$ a non-metavariable 
type in $\Theta$ such that $\alpha \in \mathrm{fmv}(\tau)$. There is no context $\Theta'$ and metasubstitution 
$\theta : \Theta \sqsubseteq \Theta'$ such that $\Theta' \vdash \theta\alpha \equiv \theta\tau : \star$. 

Proof. Suppose otherwise. By expanding definitions in $\Theta'$ we have a type 
containing no defined metavariables that is equal to a proper subterm of itself, but 
induction on the definition of equality shows that this is impossible. □ 

Exposing the structure underlying unification makes termination of the al- 
gorithm evident (McBride, 2003). Each unification or instantiation step either 
shortens the overall context, shortens the uninspected context left of the bar (for 
instantiation) or preserves the context and decomposes types. 

Lemma 2.8 (Completeness of unification). 

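(a) If $\theta : \Theta_0 \sqsubseteq \Theta'$, $\Theta_0 \vdash \upsilon : \star \wedge \tau : \star$ and $\Theta' \vdash \theta\upsilon \equiv \theta\tau : \star$, then there is some 
context $\Theta_1$ such that $\Theta_0 \vdash \upsilon \equiv \tau : \star \dashv \Theta_1$. 

(b) Moreover, if $\theta : \Theta_0, \Xi \sqsubseteq \Theta'$ is such that $\Theta' \vdash \theta\alpha \equiv \theta\tau : \star$ and the input 
conditions (Definition 2.1) are satisfied, then $\Theta_0 \mid \Xi \vdash \alpha \equiv \tau : \star \dashv \Theta_1$ for some 
context $\Theta_1$. 

Proof. Since the algorithm terminates, it suffices to show that it covers every case 
such that a solution can exist. Each step preserves solutions: if the equation in a 
conclusion can be solved, so can those in its premises. The only omitted case is 

$$\Theta_0, \alpha : \star \mid \Xi \vdash \alpha \equiv \tau : \star \dashv \Theta_1 \quad \text{with } \alpha \in \mathrm{fmv}(\tau),$$

but Lemma 2.7 implies that this has no solutions. □ 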






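$\Theta \vdash t : \sigma$ (term $t$ has type scheme $\sigma$ in $\Theta$) 

$$\frac{\Theta \ni x : \sigma \qquad \Theta \vdash \mathbf{ctx}}{\Theta \vdash x : \sigma} \qquad \frac{\Theta, x : \tau \vdash t : \upsilon}{\Theta \vdash \lambda x.t : \tau \to \upsilon} \qquad \frac{\Theta \vdash t : \tau \to \upsilon \qquad \Theta \vdash s : \tau}{\Theta \vdash t\,s : \upsilon}$$

$$\frac{\Theta \vdash s : \sigma \qquad \Theta, x : \sigma \vdash t : \sigma'}{\Theta \vdash \mathsf{let}\ x = s\ \mathsf{in}\ t : \sigma'} \qquad \frac{\Theta, \alpha : \star \vdash t : \sigma}{\Theta \vdash t : \forall\alpha.\,\sigma}$$

$$\frac{\Theta \vdash t : \forall\alpha.\,\sigma \qquad \Theta \vdash \tau : \star}{\Theta \vdash t : [\tau/\alpha]\sigma} \qquad \frac{\Theta \vdash t : \tau \qquad \Theta \vdash \tau \equiv \upsilon : \star}{\Theta \vdash t : \upsilon}$$

Figure 2.6: Declarative rules for type assignment 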



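$\Theta \vdash \sigma \succ \sigma'$ ($\sigma$ is more general than $\sigma'$ in $\Theta$) 

$$\frac{\Theta \vdash \tau \equiv \upsilon : \star}{\Theta \vdash \tau \succ \upsilon} \qquad \frac{\alpha \notin \mathrm{fmv}(\sigma) \qquad \Theta, \alpha : \star \vdash \sigma \succ \sigma'}{\Theta \vdash \sigma \succ \forall\alpha.\,\sigma'} \qquad \frac{\Theta \vdash \tau : \star \qquad \Theta \vdash [\tau/\alpha]\sigma \succ \upsilon}{\Theta \vdash \forall\alpha.\,\sigma \succ \upsilon}$$

Figure 2.7: Generic instantiation for type schemes 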



2.3 Type inference with generalisation made easy 

The deduction rules for the typing statement $t : \sigma$ are given in Figure 2.6. Type 
inference involves making this statement hold, but unlike unification, the type 
should be an output of problem-solving along with the solution context. The 
definition of constraint problems in Subsection 2.1.3 is insufficiently general. Instead, 
each parameter in a statement has a mode, either 'input' or 'output'. 

A type inference problem consists of a context $\Theta_0$ and a term $t$; a solution is a 
metasubstitution $\theta : \Theta_0 \sqsubseteq \Theta_1$ and a type $\tau$ such that $\Theta_1 \vdash t : \tau$. Such a solution 
is most general or minimal if any other solution $(\theta' : \Theta_0 \sqsubseteq \Theta', \upsilon)$ factors through 
it with cofactor $\zeta$, such that $\Theta' \vdash \upsilon \equiv \zeta\tau : \star$. 

Similarly, a type scheme inference problem consists of a context $\Theta_0$ and a 
term $t$; a solution is a metasubstitution $\theta : \Theta_0 \sqsubseteq \Theta_1$ and a scheme $\sigma$ such that 
$\Theta_1 \vdash t : \sigma$. Such a solution is most general or minimal if any other solution 
$(\theta' : \Theta_0 \sqsubseteq \Theta', \sigma')$ factors through it with cofactor $\zeta$ such that $\Theta' \vdash \zeta\sigma \succ \sigma'$. 
Here $\sigma \succ \sigma'$ is the generic instantiation relation, defined in Figure 2.7, meaning 
that any type which is an instance of $\sigma'$ is also an instance of $\sigma$. 

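Type schemes arise by quantifying a context suffix (a list of type 
metavariables) $\Xi$ over a type $\tau$, written $\forall\Xi.\tau$ and defined by 

$$\begin{array}{lcl}
\forall\cdot.\tau & \mapsto & \tau \\
\forall(\alpha : \star, \Xi).\tau & \mapsto & \forall\alpha.\,(\forall\Xi.\tau) \\
\forall(\alpha := \upsilon : \star, \Xi).\tau & \mapsto & [\upsilon/\alpha]\,(\forall\Xi.\tau)
\end{array}$$

Any scheme $\sigma = \forall\overline{\alpha_i}.\,\tau$ can be viewed in this way, using the suffix $\overline{\alpha_i : \star}$. 

Lemma 2.9. $\Theta \vdash t : (\forall\Xi.\tau)$ if and only if $\Theta, \Xi \vdash t : \tau$. 

Proof. Straightforward induction on $\Xi$. □ 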
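
The three defining clauses of ∀Ξ.τ translate almost directly into a recursive function. The sketch below is illustrative only: `Scheme`, `SuffixEntry`, `quantify` and the substitution helpers are hypothetical names, and it assumes the suffix comes from a valid context, so that declared names are distinct and no variable capture can arise.

```haskell
data Type = Meta String | Type :-> Type
  deriving (Eq, Show)
infixr 5 :->

data Scheme = Mono Type | Forall String Scheme
  deriving (Eq, Show)

-- Suffix entries, listed left to right.
data SuffixEntry = Hole String | Defn String Type
  deriving (Eq, Show)

substTy :: String -> Type -> Type -> Type
substTy a u (Meta b)
  | a == b    = u
  | otherwise = Meta b
substTy a u (s :-> t) = substTy a u s :-> substTy a u t

-- Substitution on schemes; a binder with the same name shadows the variable.
substScheme :: String -> Type -> Scheme -> Scheme
substScheme a u (Mono t) = Mono (substTy a u t)
substScheme a u (Forall b s)
  | a == b    = Forall b s
  | otherwise = Forall b (substScheme a u s)

-- Quantify a suffix over a type, one clause per case of the definition.
quantify :: [SuffixEntry] -> Type -> Scheme
quantify []              t = Mono t
quantify (Hole a   : xi) t = Forall a (quantify xi t)
quantify (Defn a u : xi) t = substScheme a u (quantify xi t)
```

Unknown metavariables become quantifiers, while defined ones are substituted away, so definitions made during unification never leak into the resulting scheme.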

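2.3.1 The Generalist's lemma 

Recall that $\fatsemi$ markers divide the context into localities. In the type inference 
algorithm, the metavariables that can be generalised are exactly those in the 
current locality. This relies on the following lemma, which states that a minimal 
solution to a type scheme inference problem can be found from a minimal solution 
to a type inference problem. 

Crucially, a substitution for variables in a locality cannot depend on variables 
in a 'more local' one: for example, $[\beta/\alpha, \beta/\beta] : \alpha : \star \fatsemi \beta : \star \sqsubseteq \cdot \fatsemi \beta : \star$ is forbidden. 
This allows any $\theta : \Theta \fatsemi \Xi \sqsubseteq \Theta' \fatsemi \Xi'$ to be restricted to variables in $\Theta$, so that 
$\theta|_{\Theta} : \Theta \sqsubseteq \Theta'$. 

Lemma 2.10 (The Generalist's lemma). If $\theta : \Theta_0 \fatsemi \sqsubseteq \Theta_1 \fatsemi \Xi$ is a minimal solution 
of the type inference problem for $t$ with output $\tau$, then $\theta : \Theta_0 \sqsubseteq \Theta_1$ is a minimal 
solution of the type scheme inference problem for $t$ with output $\forall\Xi.\tau$. 

Proof. If $\theta : \Theta_0 \fatsemi \sqsubseteq \Theta_1 \fatsemi \Xi$ then $\theta : \Theta_0 \sqsubseteq \Theta_1$ by definition of $\sqsubseteq$. Furthermore, 
$\Theta_1 \vdash t : (\forall\Xi.\tau)$ holds iff $\Theta_1 \fatsemi \Xi \vdash t : \tau$ by Lemma 2.9. 

For minimality, suppose $\theta' : \Theta_0 \sqsubseteq \Theta'$ is an information increase and $\forall\overline{\alpha_i}.\,\upsilon$ is a 
scheme such that $\Theta' \vdash t : \forall\overline{\alpha_i}.\,\upsilon$. Then $\Theta', \overline{\alpha_i : \star} \vdash t : \upsilon$. Now $\theta' : \Theta_0 \fatsemi \sqsubseteq \Theta' \fatsemi \overline{\alpha_i : \star}$ 
and $\Theta' \fatsemi \overline{\alpha_i : \star} \vdash t : \upsilon$, so by minimality of the hypothesis there is a cofactor 
$\zeta : \Theta_1 \fatsemi \Xi \sqsubseteq \Theta' \fatsemi \overline{\alpha_i : \star}$ such that $\theta' \equiv \zeta \cdot \theta$ and $\Theta' \fatsemi \overline{\alpha_i : \star} \vdash \zeta\tau \equiv \upsilon : \star$. Then 
$\zeta|_{\Theta_1} : \Theta_1 \sqsubseteq \Theta'$, $\theta' \equiv \zeta|_{\Theta_1} \cdot \theta$ and $\Theta' \vdash \zeta|_{\Theta_1} (\forall\Xi.\tau) \succ \forall\overline{\alpha_i}.\,\upsilon$ as required. □ 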
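
Operationally, the lemma says that the generalisable metavariables are exactly those to the right of the last separator. A minimal sketch of this step, assuming contexts are plain lists written left to right (the names `Entry`, `Sep` and `skim` are hypothetical):

```haskell
data Type = Meta String | Type :-> Type
  deriving (Eq, Show)
infixr 5 :->

data Entry
  = Hole String        -- alpha : *
  | Defn String Type   -- alpha := tau : *
  | TmVar String       -- x : sigma (scheme elided in this sketch)
  | Sep                -- locality separator
  deriving (Eq, Show)

isMeta :: Entry -> Bool
isMeta (Hole _)   = True
isMeta (Defn _ _) = True
isMeta _          = False

-- Split a context at its last separator, returning the context to the
-- left of it and the suffix of metavariable declarations to generalise over.
-- Returns Nothing if there is no separator, or if a term variable sits in
-- the current locality.
skim :: [Entry] -> Maybe ([Entry], [Entry])
skim cx = case break (== Sep) (reverse cx) of
  (xiRev, Sep : restRev)
    | all isMeta xiRev -> Just (reverse restRev, reverse xiRev)
  _ -> Nothing
```

The returned suffix is what would be quantified over the inferred type to form the scheme ∀Ξ.τ; entries left of the separator stay in the context untouched.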

2.3.2 Transforming type assignment into type inference 

The typing rules in Figure 2.6 do not directly lead to a type inference algorithm, 
as they permit unrestricted generalisation and instantiation of type schemes. To 
resolve this, an equivalent system (assigning types rather than type schemes) is 
given in Figure 2.8, where instantiation occurs only at variables, and 
generalisation at let-bindings. This transformation is well known: a clear presentation is 
given by Clement et al. (1986) resulting in the rules of Figure 2.1. 

$\Theta \vdash t : \tau$ 

$$\frac{\Theta \ni x : \sigma \qquad \Theta \vdash \sigma \succ \tau}{\Theta \vdash x : \tau}\ \text{VAR} \qquad \frac{\Theta, x : \tau \vdash t : \upsilon}{\Theta \vdash \lambda x.t : \tau \to \upsilon}\ \text{LAM}$$

$$\frac{\Theta \vdash s : \tau' \to \tau \qquad \Theta \vdash t : \tau'}{\Theta \vdash s\,t : \tau}\ \text{APP} \qquad \frac{\Theta \fatsemi \Xi \vdash s : \upsilon \qquad \Theta, x : (\forall\Xi.\upsilon) \vdash t : \tau}{\Theta \vdash \mathsf{let}\ x = s\ \mathsf{in}\ t : \tau}\ \text{LET}$$

Figure 2.8: Transformed rules for type assignment 

From the transformed rules, an algorithm can be constructed to match. To 
convert a rule into algorithmic form, proceed clockwise starting from the inputs 
to the conclusion. For each premise, ensure that the problem inputs are fully 
specified (by the inputs to the conclusion and the outputs of previous premises), 
inserting metavariables to stand for unknown inputs. Instead of pattern matching 
on problem outputs, ensure there are schematic variables in output positions, and 
reintroduce unification constraints as necessary. 

The type inference judgment $\Theta_0 \vdash t : \tau \dashv \Theta_1$ and the scheme inference 
judgment $\Theta_0 \vdash t : \sigma \dashv \Theta_1$ are defined by the rules in Figure 2.9. As they are 
structural on terms, they yield a terminating algorithm. The Optimist's lemma 
means that sequential solution of problems delivers a minimal solution, and the 
Generalist's lemma makes it easy to reduce type scheme inference problems to 
type inference problems. 

The $\lambda$-rule now generates a metavariable for the argument type. The rule for 
application assigns types to the function and argument separately, then inserts 
an equation with a fresh name for the codomain type. 

The type inference algorithm is implemented in Appendix A.3 (page 202). 

2.3.3 Correctness of type inference 

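Since the algorithmic rules correspond directly to the transformed declarative 
system in Figure 2.8, it is easy to prove soundness, completeness and generality 
of type inference with respect to this system. 

$\Theta_0 \vdash t : \sigma \dashv \Theta_1$ (term $t$ in context $\Theta_0$ has inferred scheme $\sigma$ in context $\Theta_1$) 

$$\frac{\Theta_0 \fatsemi \vdash t : \tau \dashv \Theta_1 \fatsemi \Xi}{\Theta_0 \vdash t : (\forall\Xi.\tau) \dashv \Theta_1}\ \text{INFER-GEN}$$

$\Theta_0 \vdash t : \tau \dashv \Theta_1$ (term $t$ in context $\Theta_0$ has inferred type $\tau$ in context $\Theta_1$) 

$$\frac{\Theta_0 \ni x : \forall\overline{\alpha_i}.\,\upsilon}{\Theta_0 \vdash x : \upsilon \dashv \Theta_0, \overline{\alpha_i : \star}}\ \text{INFER-VAR}$$

$$\frac{\Theta_0, \alpha : \star, x : \alpha \vdash t : \tau \dashv \Theta_1, x : \alpha, \Xi}{\Theta_0 \vdash \lambda x.t : \alpha \to \tau \dashv \Theta_1, \Xi}\ \text{INFER-LAM}$$

$$\frac{\Theta_0 \vdash s : \upsilon \dashv \Theta_1 \qquad \Theta_1 \vdash t : \upsilon' \dashv \Theta_2 \qquad \Theta_2, \alpha : \star \vdash \upsilon \equiv \upsilon' \to \alpha : \star \dashv \Theta_3}{\Theta_0 \vdash s\,t : \alpha \dashv \Theta_3}\ \text{INFER-APP}$$

$$\frac{\Theta_0 \vdash s : \sigma \dashv \Theta_1 \qquad \Theta_1, x : \sigma \vdash t : \tau \dashv \Theta_2, x : \sigma, \Xi}{\Theta_0 \vdash \mathsf{let}\ x = s\ \mathsf{in}\ t : \tau \dashv \Theta_2, \Xi}\ \text{INFER-LET}$$

Figure 2.9: Algorithmic rules for type inference 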
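
As an illustration of the variable rule, the following sketch specialises a type scheme by replacing each quantified variable with a fresh metavariable and returning the new declarations to append to the context. The names (`specialise`, `substScheme`) and the fresh-name scheme (`"?0"`, `"?1"`, and so on, driven by an explicit counter) are assumptions of this sketch; it further assumes that generated names never clash with names already in use.

```haskell
data Type = Meta String | Type :-> Type
  deriving (Eq, Show)
infixr 5 :->

data Scheme = Mono Type | Forall String Scheme
  deriving (Eq, Show)

substTy :: String -> Type -> Type -> Type
substTy a u (Meta b)
  | a == b    = u
  | otherwise = Meta b
substTy a u (s :-> t) = substTy a u s :-> substTy a u t

substScheme :: String -> Type -> Scheme -> Scheme
substScheme a u (Mono t) = Mono (substTy a u t)
substScheme a u (Forall b s)
  | a == b    = Forall b s
  | otherwise = Forall b (substScheme a u s)

-- Given a counter for fresh names and a scheme, return the updated counter,
-- the fresh metavariables to declare (left to right) and the resulting type.
specialise :: Int -> Scheme -> (Int, [String], Type)
specialise n (Mono t) = (n, [], t)
specialise n (Forall a s) =
  let fresh       = "?" ++ show n
      (n', as, t) = specialise (n + 1) (substScheme a (Meta fresh) s)
  in  (n', fresh : as, t)
```

The fresh metavariables land at the right-hand end of the context, in the current locality, which is why a later generalisation step can quantify over them again if nothing outside the locality comes to depend on them.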

Lemma 2.11 (Soundness and generality of type inference). If $\Theta_0 \vdash t : \tau \dashv \Theta_1$, 
then $\Theta_0 \sqsubseteq \Theta_1$ is a minimal solution to the type inference problem for $t$ with output 
$\tau$. Similarly, if $\Theta_0 \vdash t : \sigma \dashv \Theta_1$ then $\Theta_0 \sqsubseteq \Theta_1$ is a minimal solution to the type 
scheme inference problem for $t$ with output $\sigma$. 

Proof. By induction on derivations, using the Optimist's lemma (2.4) and 
Generalist's lemma (2.10). For details, see Appendix D.1 (page 237). □ 

Lemma 2.12 (Completeness of type inference). 

(a) If $(\Theta_0, t)$ is a type inference problem with solution $(\theta : \Theta_0 \sqsubseteq \Theta', \upsilon)$, then 
$\Theta_0 \vdash t : \tau \dashv \Theta_1$ for some $\Theta_1$ and $\tau$. 

(b) If $(\Theta_0, t)$ is a scheme inference problem with solution $(\theta : \Theta_0 \sqsubseteq \Theta', \sigma')$, then 
$\Theta_0 \vdash t : \sigma \dashv \Theta_1$ for some $\Theta_1$ and $\sigma$. 

Proof. By induction on the derivation of $\Theta' \vdash t : \upsilon$ or $\Theta' \vdash t : \sigma'$ in the transformed 
declarative system of Figure 2.8. Each case corresponds directly to an algorithmic 
rule. For details, see Appendix D.1 (page 238). □ 

2.4 Elaboration, zipper style 



Elaboration is a step beyond type inference, where instead of merely generating 
a type corresponding to the source term, a representation of the term in a more 
explicit calculus is generated. This might seem excessive for the simple Hindley- 
Milner system, but for more complex type systems (particularly those involving 
dependent types) the distinction is helpful. In Chapter 7, I will discuss elaboration 
of a Haskell-like language. Here, to introduce the idea of elaboration, I show how 
to elaborate Hindley-Milner terms into explicitly-typed predicative System F. 
This algorithm is implemented in Appendix A.4 (page 204). 

The grammar of System F terms is 

$$e ::= x \mid \lambda x{:}\sigma.\,e \mid \Lambda\alpha{:}\star.\,e \mid e\ e' \mid e\ \tau$$

where $\lambda$-bound variables have type annotations, and type abstraction and 
application are explicit. The type system is standard, and hence omitted; it is 
essentially a syntax-directed version of the declarative system in Figure 2.6. 

So far in this chapter, the context structure has carried the 'linguistic' context 
of term variables and type metavariables, but the type inference algorithm has 
separately managed the 'syntactic' context (the structure of the term). Variable 
bindings and the $\fatsemi$ marker are vestiges of the syntactic context: a variable 
represents the fact that type inference is taking place under a $\lambda$- or let-binding, and 
a $\fatsemi$ marker represents 'being under a let-definiens'. Let me take this idea to its 
natural conclusion, identifying the syntactic and linguistic contexts into a single 
data structure that represents progress through a type inference problem. 

Huet (1997) taught us how to use a 'zipper' data structure to represent a 
position in a tree, such as a term. The path to the current location is represented 
as a list of layers, where each layer corresponds to choosing a single branch at 
a node, and stores the subtrees rooted at the other branches. McBride (2001) 
observed that the type of the zipper can be computed by differentiation, and 
further refined the structure to represent left-to-right progress through a term 
(McBride, 2008). Terms 'to the left' of the current location have been elaborated 
to a typed System F term, while those 'to the right' have not yet been visited. 
Thus the syntax of layers is given by 

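$$\mathcal{L} ::= [\,]\ t \mid (e : \tau)\ [\,] \mid \lambda x{:}\tau.\,[\,] \mid \mathsf{let}\ x = [\,]\ \mathsf{in}\ t \mid \mathsf{let}\ x{:}\sigma = e\ \mathsf{in}\ [\,]$$

where a hole $[\,]$ represents the current position. Contexts are adapted to include 
layers rather than variables or $\fatsemi$ markers: 

$$\Theta ::= \cdot \mid \Theta, \alpha : \star \mid \Theta, \alpha := \tau : \star \mid \Theta, \mathcal{L}$$

$$\begin{array}{lcl}
\Theta \downarrow x & \mapsto & \Theta, \overline{\alpha_i : \star} \uparrow x\ \overline{\alpha_i} : \tau \quad \text{if } \Theta \ni x : \forall\overline{\alpha_i}.\,\tau \\
\Theta \downarrow s\ t & \mapsto & \Theta, [\,]\ t \downarrow s \\
\Theta \downarrow \lambda x.t & \mapsto & \Theta, \alpha : \star, \lambda x{:}\alpha.\,[\,] \downarrow t \\
\Theta \downarrow \mathsf{let}\ x = s\ \mathsf{in}\ t & \mapsto & \Theta, \mathsf{let}\ x = [\,]\ \mathsf{in}\ t \downarrow s \\
\Theta, \lambda x{:}\upsilon.\,[\,], \Xi \uparrow e : \tau & \mapsto & \Theta, \Xi \uparrow \lambda x{:}\upsilon.\,e : \upsilon \to \tau \\
\Theta, \mathsf{let}\ x = [\,]\ \mathsf{in}\ t, \Xi \uparrow e : \tau & \mapsto & \Theta, \mathsf{let}\ x{:}\forall\Xi.\tau = \Lambda\Xi.\,e\ \mathsf{in}\ [\,] \downarrow t \\
\Theta, \mathsf{let}\ x{:}\sigma = e'\ \mathsf{in}\ [\,], \Xi \uparrow e : \tau & \mapsto & \Theta, \Xi \uparrow (\lambda x{:}\sigma.\,e)\ e' : \tau \\
\Theta, [\,]\ t, \Xi \uparrow e : \tau & \mapsto & \Theta, \Xi, (e : \tau)\ [\,] \downarrow t \\
\Theta, (e' : \upsilon)\ [\,], \Xi \uparrow e : \tau & \mapsto & \Theta' \uparrow e'\ e : \beta \\
 & & \quad \text{where } \Theta, \Xi, \beta : \star \vdash \upsilon \equiv \tau \to \beta : \star \dashv \Theta'
\end{array}$$

Figure 2.10: Elaboration as state-transformation 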

Now that $\Theta$ represents the entire context of an elaboration problem, 
elaboration can be implemented tail-recursively as a state-transforming automaton. 
Figure 2.10 shows the elaboration algorithm. It is divided into two modes: 

• The 'downwards' mode $\Theta \downarrow t$ takes a context and a source term which is 
being elaborated. If it is a variable, control switches to the 'upwards' mode, 
otherwise it moves into an appropriate subterm by extending the context. 

• The 'upwards' mode $\Theta \uparrow e : \tau$ takes a context and an elaborated System F 
term with its type. It examines the context to move outwards, refocus on 
the next subterm to elaborate, then switch back to downwards mode. 

The algorithm should be invoked in downwards mode with the empty context 
and the original term to be elaborated. Eventually, if the term is well-typed, the 
upwards mode will run out of layers and terminate with the elaborated version 
of the term and its type. 
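
The control structure of the two modes can be seen in isolation in the following Haskell sketch. It is deliberately simplified and is not the algorithm of Figure 2.10: it works on untyped terms, omits metavariables, types and unification entirely, and merely visits the variables of a term from left to right before rebuilding it. All names (`Tm`, `Layer`, `down`, `up`, `visit`) are hypothetical.

```haskell
data Tm = Var String | App Tm Tm | Lam String Tm | Let String Tm Tm
  deriving (Eq, Show)

-- One layer records which branch was taken and stores the sibling subterms.
data Layer
  = AppLeft Tm          -- [] t : function in focus, argument not yet visited
  | AppRight Tm         -- s [] : function already visited
  | LamBody String      -- \x. []
  | LetDefn String Tm   -- let x = [] in t
  | LetBody String Tm   -- let x = s in []
  deriving (Eq, Show)

-- Downwards mode: descend to the leftmost variable, pushing layers.
down :: [Layer] -> Tm -> ([Layer], String)
down ls (Var x)     = (ls, x)
down ls (App s t)   = down (AppLeft t : ls) s
down ls (Lam x t)   = down (LamBody x : ls) t
down ls (Let x s t) = down (LetDefn x t : ls) s

-- Upwards mode: move outwards with a finished subterm until an unvisited
-- sibling is found (then refocus downwards) or the layers run out.
up :: [Layer] -> Tm -> Either Tm ([Layer], String)
up [] done = Left done
up (l : ls) done = case l of
  AppLeft t   -> Right (down (AppRight done : ls) t)
  AppRight s  -> up ls (App s done)
  LamBody x   -> up ls (Lam x done)
  LetDefn x t -> Right (down (LetBody x done : ls) t)
  LetBody x s -> up ls (Let x s done)

-- Run the automaton: the variables in visit order and the rebuilt term.
visit :: Tm -> ([String], Tm)
visit t0 = go (down [] t0)
  where
    go (ls, x) = case up ls (Var x) of
      Left done  -> ([x], done)
      Right next -> let (xs, done) = go next in (x : xs, done)
```

Because all state lives in the list of layers, the traversal could be paused and resumed at any point, which is the property the text goes on to exploit when constraints cannot be solved immediately.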

This explicit representation of partial progress through an elaboration problem 
is very useful when constraints cannot always be solved immediately, as in a 
dependently typed setting. Elaboration is no longer a left-to-right march through 
the term structure, but may involve back-and-forth refocusing as the elaborator 
finds places where progress can be made. This is the basis of the implementation 
of elaboration in Epigram. 

2.5 Discussion 



In this chapter, I have given an implementation of Hindley-Milner type inference 
involving all the same steps as Algorithm W, but not necessarily in the same 
order. In particular, the dependency panic that seizes W in the let-rule becomes 
an invariant that the unification algorithm maintain a well-founded context. 

The algorithm is presented as a problem transformation system locally pre- 
serving solutions, hence finding a most general global solution if any solutions 
exist at all. Accumulating solutions to decomposed problems is justified simply 
by stability of solutions on information increase. The discipline of problem solv- 
ing established here is happily complete for Hindley-Milner type inference, but 
in any case couples soundness with generality. 

Maintain context validity, make definitions anywhere and only where there 
is no choice, so the solutions you find will be general and generalisable locally: 
this is a key design principle for elaboration of high-level code in systems like 
Epigram and Agda, and bugs arise from its transgression. The account given 
here of 'current information' in terms of contexts and their information ordering 
provides a principled means to investigate and repair these troubles. 

There is, however, some way to go. Algorithm W is a conveniently structural 
type inference process for 'finished' expressions in a setting where unification is 
complete. Each subproblem is either solved or rejected on first inspection — there 
is never a need for a 'later, perhaps' outcome. As a result, 'direct style' recursive 
programming is adequate to the task. If problems could get stuck, how might an 
algorithm abandon them and return to them later? By storing their context, of 
course! In Chapter 4, I will take exactly this approach to deal with higher-order 
unification problems. 

First, though, I will extend the framework in another direction: handling units 
of measure with the equational theory of abelian groups. Variable dependency 
becomes more subtle in the presence of a nontrivial equational theory, and so 
maintaining a well-founded context (in order to make generalisation 
straightforward) is even more crucial. 

2.5.1 Related work 



The idea of assertions consuming an input context and producing an output 
context goes back at least to Pollack (1990). Nipkow and Prehofer (1995) use 
unordered input and output contexts to pass information about Haskell typeclass 
inference, with a conventional substitution-based presentation of unification. 

The work of Dunfield and Krishnaswami (2013) on higher-rank polymorphism 
in a bidirectional type system, based on earlier work by Dunfield (2009), uses well- 
founded contexts that contain existential type variables (amongst other things). 
They rely on a notion of context extension in a similar way to my definition of in- 
formation increase between input and output contexts, and while their treatment 
of unification is different (since they are dealing with subtyping for higher-rank 
polymorphism, rather than let-generalisation) there are some similarities with the 
approach I have described. 

An alternative approach to generalisation, used in some ML implementations
for the sake of efficiency, involves assigning numeric 'ranks' to type variables
based on the number of bindings they are introduced under, then generalising
over variables whose rank is sufficiently large. Rémy (1992) implemented an
algorithm based on counting let-bindings as part of the OCaml typechecker, and
Kiselyov (2013) gives a clear explanation of Rémy's algorithm which relates it to
region-based memory management. Kuan and MacQueen (2007) formalised and
compared approaches that count let- and $\lambda$-bindings; they attribute the idea for
counting $\lambda$-bindings to Damas (1984). The algorithm I described manages ranks
implicitly, by representing type variables in an ordered context, in which the
$\fatsemi$ marker corresponds to increasing the rank.






Chapter 3 



Unification and type inference for 
units of measure 

In the previous chapter, I described a 'problem solving' rationalisation of syn- 
tactic unification and Hindley-Milner type inference that provides a more refined 
account of dependency analysis. Term and type variables live in a dependency- 
ordered context. Problems are solved in small steps, each of which is most general 
and involves minimal extra dependency. This makes let-generalisation particu- 
larly easy: simply 'skim off' generalisable type variables from the end of the 
context, as nothing can depend on them. 

I now move on to consider one of the many extensions of the Hindley-Milner 
system, namely units of measure in the style of Kennedy (1996a,b, 2010). My 
approach to type inference gives a clearer account of the subtle issues surrounding 
generalisation in the presence of a nontrivial equational theory on types. This 
chapter is based on work presented at TFP 2011 (Gundry, 2011). A Haskell im- 
plementation of the unification algorithm described here is given in Appendix B. 

Consider this Haskell function, traditionally of type Float -> Float:

distanceTravelled t = velocity * t + (acceleration * t * t) / 2 
where {velocity = 2.0; acceleration = 3.6} 

Kennedy (1996b) shows how to check units of measure for such terms: with
velocity and acceleration annotated with their units ($\mathrm{m} * \mathrm{s}^{-1}$ and
$\mathrm{m} * \mathrm{s}^{-2}$), the system could infer the type
$\mathsf{Float}\langle\mathrm{s}\rangle \to \mathsf{Float}\langle\mathrm{m}\rangle$ for the whole function. Type
inference relies on unification, but units need a more liberal equational theory
than syntactic equality, as $\mathrm{m} * \mathrm{s}^{-1} * \mathrm{s}$ should mean the same thing as
$\mathrm{m}$. Kennedy uses the theory of abelian groups. He has introduced units of
measure with polymorphism into the functional programming language F# (Syme, 2010).



3.0.1 A troublesome example 

Algorithm W relies on dependency analysis for let-generalisation. Using the
occurs check to identify generalisable variables (those that are free in the type but
not the typing environment) is problematic for the equational theory of abelian
groups, as variable occurrence does not imply variable dependency. Later I will
show another way of looking at this: given the equation $\alpha \equiv \tau$, where $\alpha$ is a
metavariable and $\tau$ is a type, the solution $[\tau/\alpha]$ is not necessarily most general!
In this chapter, I will give an analysis of dependency that exposes and resolves
the difficulties with generalisation.

Kennedy (2010, p. 292) gives the example (notation adapted): 

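\[ \lambda x.\ \mathsf{let}\ y = \mathit{div}\ x\ \mathsf{in}\ (y\ \mathit{mass},\ y\ \mathit{time}), \quad \text{where} \]
\[ \mathit{div} : \forall\alpha{:}\mathcal{U}.\ \forall\beta{:}\mathcal{U}.\ \mathsf{F}\langle\alpha * \beta\rangle \to \mathsf{F}\langle\alpha\rangle \to \mathsf{F}\langle\beta\rangle, \quad \mathit{mass} : \mathsf{F}\langle\mathrm{kg}\rangle, \quad \mathit{time} : \mathsf{F}\langle\mathrm{s}\rangle. \]

Here $\mathsf{F}\langle\nu\rangle$ is a type of numbers with units $\nu$, defined in Subsection 3.0.2. If
one adds constraint solving for units to Algorithm W with the usual occurrence-
based let-generalisation rule, the resulting algorithm fails to infer a type for this
term, because polymorphism is lost: $y$ is given the monotype
$\mathsf{F}\langle\alpha\rangle \to \mathsf{F}\langle\beta * \alpha^{-1}\rangle$ where $\alpha$ and $\beta$ are unification
metavariables, and $\alpha$ cannot unify with both $\mathrm{kg}$ and $\mathrm{s}$.
However, if $y$ is given its principal type scheme
$\forall\alpha{:}\mathcal{U}.\ \mathsf{F}\langle\alpha\rangle \to \mathsf{F}\langle\beta * \alpha^{-1}\rangle$, then the term has type
$\mathsf{F}\langle\beta\rangle \to (\mathsf{F}\langle\beta * \mathrm{kg}^{-1}\rangle, \mathsf{F}\langle\beta * \mathrm{s}^{-1}\rangle)$, as described in Section 3.3.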
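
To see why the type scheme is enough, the two uses of $y$ can be checked by instantiating the quantified variable separately at each use. The following is only a worked illustration of the claim above, with $\beta$ standing for the unit of $x$:

```latex
\begin{align*}
y\ \mathit{mass} &: \mathsf{F}\langle\beta * \mathrm{kg}^{-1}\rangle
  && \text{instantiating } \alpha := \mathrm{kg},\ \text{since } \mathit{mass} : \mathsf{F}\langle\mathrm{kg}\rangle \\
y\ \mathit{time} &: \mathsf{F}\langle\beta * \mathrm{s}^{-1}\rangle
  && \text{instantiating } \alpha := \mathrm{s},\ \text{since } \mathit{time} : \mathsf{F}\langle\mathrm{s}\rangle
\end{align*}
```

With the monotype, by contrast, the single metavariable $\alpha$ would have to equal both $\mathrm{kg}$ and $\mathrm{s}$ at once.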

The difficulty is that the algorithm fails to assign principal type schemes to 
open terms because of the nontrivial equational theory on types. One way around 
this difficulty is to apply a generaliser, "a substitution that 'reveals' the
polymorphism available under a given type environment",¹ due to Kennedy (1996a)
and Rittri (1995). Such a substitution preserves types in the context (up to the
equational theory) but rearranges group variables so that the Algorithm W
generalisation rule can be used. Calculating a generaliser is specific to the equational
theory and technically nontrivial. It is not implemented in F#, so Kennedy's 
example does not type check: 

> fun x -> let y z = x / z in (y mass, y time);;

error FS0001: Type mismatch.
Expecting a float<kg> but given a float<s>
The unit of measure 'kg' does not match the unit of measure 's'

¹ Kennedy (1996a, p. 23)
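Term variables        $x, y$
Type metavariables    $\alpha, \beta, \gamma$
Kinds                 $\kappa ::= * \mid \mathcal{U}$
Contexts              $\Theta ::= \cdot \mid \Theta, \alpha : \kappa \mid \Theta, \alpha := \rho : \kappa \mid \Theta, x : \sigma \mid \Theta \fatsemi$
Suffixes              $\Xi ::= \cdot \mid \Xi, \alpha : \kappa \mid \Xi, \alpha := \rho : \kappa$
Unit suffix           $\Gamma ::= \cdot \mid \alpha : \mathcal{U}$
Type expressions      $\rho ::= \alpha \mid \rho \to \rho' \mid \mathsf{F}\langle\rho\rangle \mid b \mid 1 \mid \rho * \rho' \mid \rho^{-1}$
Types                 $\tau, \upsilon ::= \alpha \mid \tau \to \upsilon \mid \mathsf{F}\langle\nu\rangle$
Units                 $\nu ::= \alpha \mid b \mid 1 \mid \nu * \nu' \mid \nu^{-1}$
Base units            $b ::= \mathrm{kg} \mid \mathrm{m} \mid \mathrm{s}$
Type schemes          $\sigma ::= \rho \mid \forall\alpha : \kappa.\ \sigma$
Terms                 $t, s ::= x \mid \lambda x.t \mid s\ t \mid \mathsf{let}\ x := s\ \mathsf{in}\ t$
Statements            $J ::= \mathsf{ctx} \mid \sigma : \kappa \mid \rho \equiv \rho' : \kappa \mid t : \sigma \mid \sigma \succ \sigma' \mid J \wedge J'$
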
Figure 3.1: Syntax 
3.0.2 Extending the framework 

In this chapter I extend the unification algorithm from Chapter 2 (and hence 
type inference) to the theory of abelian groups. Mistaking occurrence for de- 
pendency will show up as the source of the difficulty described above, leading to 
a straightforward solution. With more structure in the context than just typ- 
ing assumptions, it is easier to see where generality can be lost, and the loss of 
polymorphism can be avoided in the first place instead of recovered after the fact. 

The syntax of contexts, expressions and statements is given in Figure 3.1. As
before, a context is a list of metavariable declarations $\alpha : \kappa$, definitions
$\alpha := \rho : \kappa$, term variable declarations $x : \sigma$ and $\fatsemi$ markers. Now,
however, metavariables may have kind $*$ (a type) or $\mathcal{U}$ (a unit). Similarly,
type schemes record the kind of quantified variables, and the typing and equality
statements include kinds. For example,

\[ \alpha : *,\ \beta : \mathcal{U},\ x : (\forall\gamma{:}\mathcal{U}.\ \alpha \to \mathsf{F}\langle\beta * \gamma\rangle) \]

is a valid context. A common syntax of type expressions $\rho$ has subgrammars for
types $\tau$ and units $\nu$.

Figure 3.2 gives rules to construct a valid context and interpret variables in
the context. These are similar to the rules for the Hindley-Milner system from the
previous chapter (Figure 2.3, page 12), with the addition of the kind $\mathcal{U}$. Types are
extended to include a single new type $\mathsf{F}\langle\nu\rangle$ representing a numeric type indexed
by a unit $\nu$. A real implementation would allow user-defined unit-indexed types,
but one suffices for illustration.
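$\Theta \vdash \mathsf{ctx}$   ($\Theta$ is a valid context)

\[ \frac{}{\cdot \vdash \mathsf{ctx}} \qquad
   \frac{\alpha \# \Theta \quad \Theta \vdash \mathsf{ctx}}{\Theta, \alpha : \kappa \vdash \mathsf{ctx}} \qquad
   \frac{\alpha \# \Theta \quad \Theta \vdash \rho : \kappa}{\Theta, \alpha := \rho : \kappa \vdash \mathsf{ctx}} \qquad
   \frac{\Theta \vdash \sigma : *}{\Theta, x : \sigma \vdash \mathsf{ctx}} \qquad
   \frac{\Theta \vdash \mathsf{ctx}}{\Theta \fatsemi \vdash \mathsf{ctx}} \]

$\Theta \vdash \sigma : \kappa$   ($\sigma$ is a well-formed scheme of kind $\kappa$ in $\Theta$)

\[ \frac{\Theta \ni \alpha : \kappa}{\Theta \vdash \alpha : \kappa} \qquad
   \frac{\Theta \vdash \tau : * \quad \Theta \vdash \upsilon : *}{\Theta \vdash \tau \to \upsilon : *} \qquad
   \frac{\Theta \vdash \nu : \mathcal{U}}{\Theta \vdash \mathsf{F}\langle\nu\rangle : *} \qquad
   \frac{\Theta, \alpha : \kappa \vdash \sigma : *}{\Theta \vdash \forall\alpha{:}\kappa.\ \sigma : *} \]

\[ \frac{}{\Theta \vdash b : \mathcal{U}} \qquad
   \frac{}{\Theta \vdash 1 : \mathcal{U}} \qquad
   \frac{\Theta \vdash \nu : \mathcal{U} \quad \Theta \vdash \nu' : \mathcal{U}}{\Theta \vdash \nu * \nu' : \mathcal{U}} \qquad
   \frac{\Theta \vdash \nu : \mathcal{U}}{\Theta \vdash \nu^{-1} : \mathcal{U}} \]

Figure 3.2: Rules for context validity and well-formed type schemes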



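$\theta : \Theta_0 \sqsubseteq \Theta_1$   ($\theta$ is a metasubstitution from $\Theta_0$ to $\Theta_1$)

\[ \frac{}{[\,] : \cdot \sqsubseteq \Xi} \qquad
   \frac{\theta : \Theta_0 \sqsubseteq \Theta_1 \quad \Theta_1 \vdash \rho : \kappa}{(\theta, \rho/\alpha) : \Theta_0, \alpha : \kappa \sqsubseteq \Theta_1} \qquad
   \frac{\theta : \Theta_0 \sqsubseteq \Theta_1 \quad \Theta_1 \vdash \rho \equiv \theta\rho' : \kappa}{(\theta, \rho/\alpha) : \Theta_0, \alpha := \rho' : \kappa \sqsubseteq \Theta_1} \]

\[ \frac{\theta : \Theta_0 \sqsubseteq \Theta_1}{\theta : \Theta_0, x : \sigma \sqsubseteq \Theta_1, x : \theta\sigma, \Xi} \qquad
   \frac{\theta : \Theta_0 \sqsubseteq \Theta_1}{\theta : \Theta_0 \fatsemi \sqsubseteq \Theta_1 \fatsemi \Xi} \]

$\theta \equiv \theta' : \Theta_0 \sqsubseteq \Theta_1$   ($\theta$ and $\theta'$ are equivalent metasubstitutions from $\Theta_0$ to $\Theta_1$)

\[ \frac{}{[\,] \equiv [\,] : \cdot \sqsubseteq \Theta_1} \qquad
   \frac{\theta \equiv \theta' : \Theta_0 \sqsubseteq \Theta_1 \quad \Theta_1 \vdash \rho \equiv \rho' : \kappa}{(\theta, \rho/\alpha) \equiv (\theta', \rho'/\alpha) : \Theta_0, \alpha : \kappa \sqsubseteq \Theta_1} \]

\[ \frac{\theta \equiv \theta' : \Theta_0 \sqsubseteq \Theta_1 \quad \Theta_1 \vdash \rho \equiv \rho' : \kappa \quad \Theta_1 \vdash \rho' \equiv \theta\rho'' : \kappa}{(\theta, \rho/\alpha) \equiv (\theta', \rho'/\alpha) : \Theta_0, \alpha := \rho'' : \kappa \sqsubseteq \Theta_1} \]

\[ \frac{\theta \equiv \theta' : \Theta_0 \sqsubseteq \Theta_1}{\theta \equiv \theta' : \Theta_0, x : \sigma \sqsubseteq \Theta_1, x : \theta\sigma, \Xi} \qquad
   \frac{\theta \equiv \theta' : \Theta_0 \sqsubseteq \Theta_1}{\theta \equiv \theta' : \Theta_0 \fatsemi \sqsubseteq \Theta_1 \fatsemi \Xi} \]
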
Figure 3.3: Rules for metasubstitutions 
The updated rules for metasubstitutions are given in Figure 3.3. These are
obvious extensions of the rules given in Subsection 2.1.2 (page 14).

Recall that a statement J is an assertion that can be judged in a context. The 
syntax of statements from the previous chapter is extended with kind information, 
and the sanity conditions (Lemma 2.1) are updated appropriately: 

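Lemma 3.1 (Sanity conditions). If $\Theta \vdash J$ then $\Theta \vdash \mathsf{San}\ J$, where

\begin{align*}
\mathsf{San}\ \mathsf{ctx} &\mapsto \mathsf{ctx} \\
\mathsf{San}\ (\sigma : \kappa) &\mapsto \mathsf{ctx} \\
\mathsf{San}\ (\tau \equiv \upsilon : \kappa) &\mapsto \tau : \kappa \wedge \upsilon : \kappa \\
\mathsf{San}\ (t : \sigma) &\mapsto \sigma : * \\
\mathsf{San}\ (\sigma \succ \sigma') &\mapsto \sigma : * \wedge \sigma' : * \\
\mathsf{San}\ (J \wedge J') &\mapsto \mathsf{San}\ J \wedge \mathsf{San}\ J'
\end{align*}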



Proof. By structural induction on derivations. □ 

The key results from the previous chapter, stability (Lemma 2.2, page 16), 
the category structure of contexts (Lemma 2.3, page 16), the Optimist's lemma 
(Lemma 2.4, page 18) and the isomorphism lemma (Lemma 2.5, page 18) apply 
to the updated notions of statement and metasubstitution without modification. 



3.1 Unification for the theory of abelian groups 

I now consider abelian group unification problems in the framework. The syntax
of types $\tau$ is extended with units of measure $\nu$ given by

$\nu ::=$
    $\alpha$            metavariable
    $b$                 base unit
    $1$                 identity
    $\nu * \nu'$        product of units
    $\nu^{-1}$          inverse

where $b$ ranges over some set of base units, which would be user-defined in a real
system for units of measure. Note that units of measure $\nu$ are just type expressions
of kind $\mathcal{U}$, but the typing rules ensure they must belong to this grammar.

The rules for equivalence of types and units are given in Figure 3.4: reflexivity, 
symmetry, transitivity and congruence, plus the four abelian group axioms of 
commutativity, associativity, inverses and identity. 
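$\Theta \vdash \rho \equiv \rho' : \kappa$   ($\rho$ and $\rho'$ are equal expressions of kind $\kappa$ in $\Theta$)

\[ \frac{\Theta \vdash \rho : \kappa}{\Theta \vdash \rho \equiv \rho : \kappa} \qquad
   \frac{\Theta \vdash \rho \equiv \rho' : \kappa}{\Theta \vdash \rho' \equiv \rho : \kappa} \qquad
   \frac{\Theta \vdash \rho_0 \equiv \rho_1 : \kappa \quad \Theta \vdash \rho_1 \equiv \rho_2 : \kappa}{\Theta \vdash \rho_0 \equiv \rho_2 : \kappa} \qquad
   \frac{\Theta \vdash \mathsf{ctx} \quad \Theta \ni \alpha := \rho : \kappa}{\Theta \vdash \alpha \equiv \rho : \kappa} \]

\[ \frac{\Theta \vdash \tau \equiv \upsilon : * \quad \Theta \vdash \tau' \equiv \upsilon' : *}{\Theta \vdash \tau \to \tau' \equiv \upsilon \to \upsilon' : *} \qquad
   \frac{\Theta \vdash \nu \equiv \nu' : \mathcal{U}}{\Theta \vdash \mathsf{F}\langle\nu\rangle \equiv \mathsf{F}\langle\nu'\rangle : *} \]

\[ \frac{\Theta \vdash \nu_0 \equiv \nu_2 : \mathcal{U} \quad \Theta \vdash \nu_1 \equiv \nu_3 : \mathcal{U}}{\Theta \vdash \nu_0 * \nu_1 \equiv \nu_2 * \nu_3 : \mathcal{U}} \qquad
   \frac{\Theta \vdash \nu_0 \equiv \nu_1 : \mathcal{U}}{\Theta \vdash \nu_0^{-1} \equiv \nu_1^{-1} : \mathcal{U}} \]

\[ \frac{\Theta \vdash \nu : \mathcal{U} \quad \Theta \vdash \nu' : \mathcal{U}}{\Theta \vdash \nu * \nu' \equiv \nu' * \nu : \mathcal{U}} \qquad
   \frac{\Theta \vdash \nu_0 : \mathcal{U} \quad \Theta \vdash \nu_1 : \mathcal{U} \quad \Theta \vdash \nu_2 : \mathcal{U}}{\Theta \vdash (\nu_0 * \nu_1) * \nu_2 \equiv \nu_0 * (\nu_1 * \nu_2) : \mathcal{U}} \]

\[ \frac{\Theta \vdash \nu : \mathcal{U}}{\Theta \vdash \nu * 1 \equiv \nu : \mathcal{U}} \qquad
   \frac{\Theta \vdash \nu : \mathcal{U}}{\Theta \vdash \nu * \nu^{-1} \equiv 1 : \mathcal{U}} \]
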
Figure 3.4: Declarative rules for unit equivalence 

Let $\nu^k$ mean $\nu$ multiplied by itself $k$ times and $\nu^{-k}$ mean $(\nu^k)^{-1}$. Units have
a normal form $\prod_i \nu_i^{k_i}$, representing the product of some distinct atoms (variables
or constants) each raised to a nonzero integer power $k_i$. For example, the
expression $\alpha * \alpha * \beta * 1 * \beta * \alpha$ has normal form $\alpha^3 * \beta^2$.

Consider the equation $\alpha^3 * \beta^2 \equiv 1$ in the context $\alpha : \mathcal{U}, \beta : \mathcal{U}$. As 2 does
not divide 3, $\beta$ cannot be defined to solve this equation, but the problem can
be simplified by taking $\beta := \gamma * \alpha^{-1}$ where $\gamma$ is a fresh variable. This leaves
$\alpha * \gamma^2 \equiv 1$ in the context $\alpha : \mathcal{U}, \gamma : \mathcal{U}$, which is solved by rearranging and defining
$\alpha := \gamma^{-2}$. Thus the solution is $\gamma : \mathcal{U},\ \alpha := \gamma^{-2} : \mathcal{U},\ \beta := \gamma * \alpha^{-1} : \mathcal{U}$, and indeed

\[ \alpha^3 * \beta^2 \equiv (\gamma^{-2})^3 * (\gamma * \alpha^{-1})^2 \equiv \gamma^{-6} * \gamma^{6} \equiv 1. \]

Along the way, the least common multiple of 2 and 3 has been calculated.
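
The normal form can be computed by collecting the power of each atom. The following is a minimal sketch, not the implementation from Appendix B: the types `Atom`, `Unit` and `NormalForm` and the function names are assumptions made for illustration, and a finite map from atoms to integer powers stands in for the product notation.

```haskell
import qualified Data.Map.Strict as Map

-- Atoms are metavariables or base units (names are illustrative).
data Atom = Var String | Base String
  deriving (Eq, Ord, Show)

-- Unit expressions, following the grammar of units in Figure 3.1.
data Unit
  = UAtom Atom
  | UOne
  | UMul Unit Unit
  | UInv Unit
  deriving Show

-- A normal form maps each distinct atom to its nonzero integer power.
type NormalForm = Map.Map Atom Int

-- Add up the powers of each atom, then discard atoms whose power is zero.
normalise :: Unit -> NormalForm
normalise = Map.filter (/= 0) . go
  where
    go (UAtom a)  = Map.singleton a 1
    go UOne       = Map.empty
    go (UMul u v) = Map.unionWith (+) (go u) (go v)
    go (UInv u)   = Map.map negate (go u)

-- Units are compared up to the abelian group laws via their normal forms.
equivalent :: Unit -> Unit -> Bool
equivalent u v = normalise u == normalise v
```

Under these assumptions, normalising the expression from the example above yields the map sending the first variable to 3 and the second to 2, and `equivalent` identifies `m * s^-1 * s` with `m`.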

More generally, when solving such an equation, one can ask whether a variable
has the largest power, and if not, reduce the other powers by it to simplify the
problem. Some notation is in order. Suppose $\nu = \prod_i \nu_i^{k_i}$ and define:

\begin{align*}
\mathrm{maxpow}(\nu) &= \max\{\, |k_i| : \nu_i \text{ metavariable} \,\}, && \text{highest absolute variable power;} \\
Q_k(\nu) &= \prod_i \nu_i^{(k_i\ \mathrm{quot}\ k)}, && \text{quotient by } k \text{ of every power;} \\
R_k(\nu) &= \prod_i \nu_i^{(k_i\ \mathrm{rem}\ k)}, && \text{remainder by } k \text{ of every power;}
\end{align*}

where quot is truncated integer division and rem is the corresponding remainder.
The point is that $\nu \equiv (Q_k(\nu))^k * R_k(\nu)$ and $\mathrm{maxpow}(R_k(\nu)) < |k|$.
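
These operations are direct to express on a finite-map representation of normal forms. The sketch below is illustrative only: the representation and the names `quotBy`, `remBy`, `mul`, `pow` and `decomposes` are assumptions, not taken from Appendix B. Haskell's `quot` and `rem` are the truncating operations referred to above.

```haskell
import qualified Data.Map.Strict as Map

data Atom = Var String | Base String
  deriving (Eq, Ord, Show)

-- Normal form: each atom mapped to a nonzero integer power.
type NormalForm = Map.Map Atom Int

isVar :: Atom -> Bool
isVar (Var _) = True
isVar _       = False

-- Highest absolute power of a metavariable (0 if there are none).
maxpow :: NormalForm -> Int
maxpow nu = maximum (0 : [abs k | (a, k) <- Map.toList nu, isVar a])

-- Quotient and remainder by k of every power, dropping zero powers.
quotBy, remBy :: Int -> NormalForm -> NormalForm
quotBy k = Map.filter (/= 0) . Map.map (`quot` k)
remBy  k = Map.filter (/= 0) . Map.map (`rem` k)

-- Product and integer power on normal forms.
mul :: NormalForm -> NormalForm -> NormalForm
mul u v = Map.filter (/= 0) (Map.unionWith (+) u v)

pow :: NormalForm -> Int -> NormalForm
pow u k = Map.filter (/= 0) (Map.map (* k) u)

-- Checks the decomposition nu = Q_k(nu)^k * R_k(nu) for a nonzero k.
decomposes :: Int -> NormalForm -> Bool
decomposes k nu = mul (pow (quotBy k nu) k) (remBy k nu) == nu
```

Because `rem` takes the sign of the dividend, every remaining power has absolute value below `abs k`, which is the bound the reduction step relies on.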
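$\Theta_0 \parallel \Gamma \vdash \nu \equiv 1 : \mathcal{U} \dashv \Theta_1$   (unifying $\nu$ with 1 in $\Theta_0, \Gamma$ results in $\Theta_1$)

\[ \frac{}{\Theta \parallel \cdot \vdash 1 \equiv 1 : \mathcal{U} \dashv \Theta}\ \text{U-TRIVIAL} \qquad
   \frac{\Theta_0 \parallel \Gamma \vdash \nu \equiv 1 : \mathcal{U} \dashv \Theta_1}{\Theta_0 \fatsemi \parallel \Gamma \vdash \nu \equiv 1 : \mathcal{U} \dashv \Theta_1 \fatsemi}\ \text{U-SKIP-SEMI} \]

\[ \frac{\alpha \notin \mathrm{fmv}(\nu) \quad \Theta_0 \parallel \Gamma \vdash \nu \equiv 1 : \mathcal{U} \dashv \Theta_1}{\Theta_0, \alpha : \kappa \parallel \Gamma \vdash \nu \equiv 1 : \mathcal{U} \dashv \Theta_1, \alpha : \kappa}\ \text{U-SKIP-TY} \qquad
   \frac{\Theta_0 \parallel \Gamma \vdash \nu \equiv 1 : \mathcal{U} \dashv \Theta_1}{\Theta_0, x : \sigma \parallel \Gamma \vdash \nu \equiv 1 : \mathcal{U} \dashv \Theta_1, x : \sigma}\ \text{U-SKIP-TM} \]

\[ \frac{\Theta_0, \Gamma \parallel \cdot \vdash [\rho/\alpha]\nu \equiv 1 : \mathcal{U} \dashv \Theta_1}{\Theta_0, \alpha := \rho : \kappa \parallel \Gamma \vdash \nu \equiv 1 : \mathcal{U} \dashv \Theta_1, \alpha := \rho : \kappa}\ \text{U-SUBS} \]

\[ \frac{k \neq 0}{\Theta, \alpha : \mathcal{U} \parallel \Gamma \vdash \alpha^k * \nu^k \equiv 1 : \mathcal{U} \dashv \Theta, \Gamma, \alpha := \nu^{-1} : \mathcal{U}}\ \text{U-DEFINE} \]

\[ \frac{|k| \leq \mathrm{maxpow}(\nu) \quad \beta \text{ fresh} \quad \Theta_0, \Gamma \parallel \beta : \mathcal{U} \vdash \beta^k * R_k(\nu) \equiv 1 : \mathcal{U} \dashv \Theta_1}{\Theta_0, \alpha : \mathcal{U} \parallel \Gamma \vdash \alpha^k * \nu \equiv 1 : \mathcal{U} \dashv \Theta_1, \alpha := \beta * Q_k(\nu)^{-1} : \mathcal{U}}\ \text{U-REDUCE} \]

\[ \frac{|k| > \mathrm{maxpow}(\nu) \quad \Theta_0 \parallel \alpha : \mathcal{U} \vdash \alpha^k * \nu \equiv 1 : \mathcal{U} \dashv \Theta_1}{\Theta_0, \alpha : \mathcal{U} \parallel \cdot \vdash \alpha^k * \nu \equiv 1 : \mathcal{U} \dashv \Theta_1}\ \text{U-COLLECT} \]
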
Figure 3.5: Algorithmic rules for abelian group unification 

3.1.1 The abelian group unification algorithm 

In this subsection, I give a new algorithm for unification problems $\nu \equiv \nu' : \mathcal{U}$. The
inverse operation means it suffices to solve problems $\nu \equiv 1 : \mathcal{U}$.

Figure 3.5 shows the algorithm presented as a collection of inference rules.
Given a context $\Theta_0, \Gamma$ and a unit $\nu$, the judgment
$\Theta_0 \parallel \Gamma \vdash \nu \equiv 1 : \mathcal{U} \dashv \Theta_1$ means
that the algorithm outputs the context $\Theta_1$ such that $\Theta_1 \vdash \nu \equiv 1 : \mathcal{U}$. Note that
the rules are entirely syntax-directed (up to the equational theory for units): at
most one rule applies for any possible initial context and unit. They lead directly
to an implementation, which is given in Appendix B.3 (page 210).

So how does the algorithm work? If the problem is $1 \equiv 1$, then it is solved by
U-TRIVIAL. Otherwise, the algorithm moves back through the context, skipping
over (meta)variables that do not occur in the problem using U-SKIP-TY or
U-SKIP-TM, and moving through localities using U-SKIP-SEMI.

The suffix $\Gamma$ will either be empty (written $\cdot$) or contain only the unknown
variable with the strictly largest power in $\nu$, if any. The U-REDUCE and
U-COLLECT rules move this variable back in the context, since there is no useful
simplification that can be applied to it. Other rules will insert the variable into
the context when it no longer has the largest power.

The interesting cases arise when a metavariable $\alpha$ that occurs in the problem
is reached. This is written $\alpha^k * \nu \equiv 1$, always meaning that $\alpha \notin \mathrm{fmv}(\nu)$. Suppose
the normal form of $\nu$ is $\prod_i \nu_i^{k_i}$. There are four possibilities, either:

(1) $k$ divides $k_i$ for all $i$;

(2) $\nu$ has at least one variable and $|k| \leq \mathrm{maxpow}(\nu)$ but case (1) does not
apply;

(3) $\nu$ has at least one variable and $|k| > \mathrm{maxpow}(\nu)$; or

(4) $\nu$ has no variables.

Case (1). If $k$ divides $k_i$ for all $i$, then there is some $\nu_0$ such that $\nu \equiv \nu_0^k$.
The rule U-DEFINE applies and sets $\alpha := \nu_0^{-1} : \mathcal{U}$ to give

\[ \alpha^k * \nu \equiv \alpha^k * \nu_0^k \equiv (\nu_0^{-1})^k * \nu_0^k \equiv 1. \]

This is clearly a solution, and it is most general for the free abelian group.

Case (2). If not, and $|k| \leq \mathrm{maxpow}(\nu)$, then the U-REDUCE rule applies
and simplifies the problem by reducing the powers modulo $k$. Recall that we
have $\nu \equiv (Q_k(\nu))^k * R_k(\nu)$ where $Q_k(\nu)$ takes the quotient by $k$ of the powers in
$\nu$. Hence, generating a fresh variable $\beta$ and defining $\alpha := \beta * Q_k(\nu)^{-1}$ gives

\[ \alpha^k * \nu \equiv (\beta * Q_k(\nu)^{-1})^k * \nu \equiv \beta^k * Q_k(\nu)^{-k} * \nu \equiv \beta^k * R_k(\nu). \]

Case (3). Suppose $|k| > \mathrm{maxpow}(\nu)$, so neither of the two previous cases
applies, but there is at least one variable in $\nu$. Now $k$ is the largest power of a
variable, so reducing the powers modulo $k$ would leave them unchanged. Instead,
the U-COLLECT rule moves $\alpha$ further back in the context. This rule maintains
the invariant that $\Gamma$ contains only the variable with the largest power, if any; the
invariant also guarantees that $\Gamma$ will be empty when the rule applies.

Case (4). If $\nu$ has no variables and $k$ does not divide the powers of the
constants in $\nu$, then $\alpha^k * \nu \equiv 1$ has no solution in the free abelian group.
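
The define and reduce steps can be tried out in a much simplified setting. The sketch below is not the algorithm of Figure 3.5: it ignores the ordered context, localities and the unit suffix entirely, and instead always works on the variable with the smallest absolute power, in the style of Kennedy's original algorithm. All names (`solve`, `applyDef`, `applyAll`, the `_g` prefix for fresh variables) are assumptions for illustration, and the input is assumed to be a normal form whose variables do not begin with `_g`.

```haskell
import qualified Data.Map.Strict as Map
import Data.List (minimumBy)
import Data.Ord (comparing)

data Atom = Var String | Base String
  deriving (Eq, Ord, Show)

type NormalForm = Map.Map Atom Int      -- atom |-> nonzero power
type Definition = (String, NormalForm)  -- metavariable := unit

norm :: NormalForm -> NormalForm
norm = Map.filter (/= 0)

-- Replace the variable a by its definition d throughout nu.
applyDef :: Definition -> NormalForm -> NormalForm
applyDef (a, d) nu = case Map.lookup (Var a) nu of
  Nothing -> nu
  Just k  -> norm (Map.unionWith (+) (Map.delete (Var a) nu) (Map.map (* k) d))

-- Apply definitions in the order in which they were made.
applyAll :: [Definition] -> NormalForm -> NormalForm
applyAll defs nu = foldl (flip applyDef) nu defs

-- Solve nu == 1. The Int is a supply of fresh variable names.
solve :: Int -> NormalForm -> Maybe [Definition]
solve fresh nu
  | Map.null nu      = Just []       -- the problem is 1 == 1
  | null vars        = Nothing       -- a constant is equated to 1
  | divisible        = Just [(a, Map.map (\j -> negate (j `quot` k)) rest)]  -- define
  | length vars == 1 = Nothing       -- only constants remain, not divisible
  | otherwise        = ((a, d) :) <$> solve (fresh + 1) (Map.insert (Var b) k r)  -- reduce
  where
    vars      = [(v, j) | (Var v, j) <- Map.toList nu]
    (a, k)    = minimumBy (comparing (abs . snd)) vars
    rest      = Map.delete (Var a) nu
    divisible = all (\j -> j `rem` k == 0) (Map.elems rest)
    b         = "_g" ++ show fresh
    q         = norm (Map.map (`quot` k) rest)
    r         = norm (Map.map (`rem` k) rest)
    d         = Map.insert (Var b) 1 (Map.map negate q)  -- b * Q_k(rest)^-1
```

On the earlier example, with powers 3 and 2, this sketch first reduces using the variable of power 2 and then defines the other, and applying the resulting definitions to the original unit with `applyAll` leaves the empty normal form. What it cannot show is the point of the context-based presentation, namely where each definition is placed relative to the locality markers.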
3.1.2 Correctness of abelian group unification

The problem-solving apparatus introduced in Subsection 2.1.3 carries over with- 
out change to this new setting, where the language of statements is more general. 
In particular, abelian group unification delivers minimal solutions. 

Lemma 3.2 (Soundness and generality of abelian group unification). If the group
unification algorithm succeeds with $\Theta_0 \parallel \Gamma \vdash \nu \equiv 1 : \mathcal{U} \dashv \Theta_1$, then
$\Theta_0, \Gamma \sqsubseteq \Theta_1$ is a minimal solution of $\nu \equiv 1 : \mathcal{U}$.

Proof. By induction on derivations, using the isomorphism lemma (Lemma 2.5).
For details, see Appendix D.2 (page 239). □

Lemma 3.3 (Completeness of abelian group unification). If $\nu$ is a well-formed
unit of measure in $\Theta_0$, and there is some $\theta : \Theta_0 \sqsubseteq \Theta'$ such that
$\Theta' \vdash \theta\nu \equiv 1 : \mathcal{U}$, then the algorithm produces $\Theta_1$ such that
$\Theta_0 \parallel \cdot \vdash \nu \equiv 1 : \mathcal{U} \dashv \Theta_1$.

Proof. A suitable metric shows that the algorithm terminates. Completeness is
by the fact that the rules cover all solvable cases and preserve solutions: if no
rule applies then the original problem can have had no solutions. This occurs if
a constant is equated to 1 (e.g. $\mathrm{kg} * \mathrm{s} \equiv 1$) or there is one variable and its power
does not divide the power of one of the constants (e.g. $\alpha^2 * \mathrm{kg} \equiv 1$). For details,
see Appendix D.2 (page 239). □



3.2 Unification for types with units of measure 

Having developed a unification algorithm for abelian groups, I now extend type 
unification to support units of measure, calling group unification from Section 3.1 
as a subroutine to solve constraints on units. As in the type unification algorithm 
of the previous chapter (Figure 2.5, page 21), there are two kinds of rules: 

• 'Unify' steps start the process: given an input context $\Theta_0$ and well-formed
types $\tau$ and $\upsilon$, the judgment $\Theta_0 \vdash \tau \equiv \upsilon : * \dashv \Theta_1$ means that the unification
problem $\tau \equiv \upsilon : *$ is solved with output context $\Theta_1$.

• 'Instantiate' steps handle flex-rigid unification problems:² given a context
$\Theta_0, \Xi$, a type metavariable $\alpha$ in $\Theta_0$ and a well-formed non-variable type $\tau$
over $\Theta_0, \Xi$, the judgment $\Theta_0 \mid \Xi \vdash \alpha \equiv \tau : * \dashv \Theta_1$ means that the problem
$\alpha \equiv \tau : *$ is solved with output context $\Theta_1$. The context suffix $\Xi$ collects
metavariable declarations that $\tau$ depends on but that cannot be used to
solve the problem.

² Recall that a flex-rigid problem is to unify a variable and a non-variable expression; a
flex-flex problem has two variables and a rigid-rigid problem has two non-variables.

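Compared to the previous chapter, the language of types (of kind $*$) now
includes a single new type $\mathsf{F}\langle\nu\rangle$ of numbers parameterised by units. I therefore
add a type unification rule that invokes abelian group unification:

\[ \frac{\Theta_0 \parallel \cdot \vdash \nu_0 * \nu_1^{-1} \equiv 1 : \mathcal{U} \dashv \Theta_1}{\Theta_0 \vdash \mathsf{F}\langle\nu_0\rangle \equiv \mathsf{F}\langle\nu_1\rangle : * \dashv \Theta_1}\ \text{UNIT} \]

Now suppose the algorithm is used to solve
$\mathsf{F}\langle\beta_0 * \beta_1\rangle \to \alpha \equiv \mathsf{F}\langle\beta_0\rangle \to \mathsf{F}\langle\beta_1\rangle$
in the context $\beta_0 : \mathcal{U}, \alpha : *, \beta_1 : \mathcal{U}$. First the constraint
$\mathsf{F}\langle\beta_0 * \beta_1\rangle \equiv \mathsf{F}\langle\beta_0\rangle : *$
is reduced to $\beta_0 * \beta_1 \equiv \beta_0 : \mathcal{U}$ by UNIT, and this is solved by group unification
(Section 3.1) to give $\beta_0 : \mathcal{U}, \alpha : *, \beta_1 := 1 : \mathcal{U}$. Then the constraint $\alpha \equiv \mathsf{F}\langle\beta_1\rangle$ is
solved to give $\beta_0 : \mathcal{U}, \alpha := \mathsf{F}\langle 1\rangle : *, \beta_1 := 1 : \mathcal{U}$.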
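
The group unification step in this example can be spelled out as a short calculation. This is only a worked restatement of the reduction above, using the equational rules for units:

```latex
\begin{align*}
(\beta_0 * \beta_1) * \beta_0^{-1}
  &\equiv \beta_1 * (\beta_0 * \beta_0^{-1}) && \text{commutativity, associativity} \\
  &\equiv \beta_1 * 1                        && \text{inverses} \\
  &\equiv \beta_1                            && \text{identity}
\end{align*}
```

The problem handed to group unification is therefore $\beta_1 \equiv 1$: the variable $\beta_0$ no longer occurs, and $\beta_1$ has power 1, which divides every remaining power, so it is defined to be 1.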

Do the rules in Figure 2.5 extended with the UNIT rule give a correct unification
algorithm for the extended type system? The unification algorithm should
be sound and complete, as the new algorithmic rule corresponds directly to the
declarative rule, but generality fails. Most general unifiers are needed for
completeness of type inference, so something had better be done.

3.2.1 Loss of generality and how to retain it 

Suppose the algorithm is used to solve the constraint $\alpha \equiv \mathsf{F}\langle\beta_0 * \beta_1\rangle$ in the
context $\alpha : * \fatsemi \beta_0 : \mathcal{U}, \beta_1 : \mathcal{U}$. As the rules stand, this flex-rigid problem is solved
by moving $\beta_0$ and $\beta_1$ into the previous locality, and defining $\alpha$, resulting in the
context $\beta_0 : \mathcal{U}, \beta_1 : \mathcal{U}, \alpha := \mathsf{F}\langle\beta_0 * \beta_1\rangle : * \fatsemi$. However, another solution exists,
namely $\gamma : \mathcal{U}, \alpha := \mathsf{F}\langle\gamma\rangle : * \fatsemi \beta_0 : \mathcal{U}, \beta_1 := \beta_0^{-1} * \gamma : \mathcal{U}$, where $\gamma$ is a fresh group
variable. This solution is more general because $\beta_0$ is still local (it has not been
moved past the $\fatsemi$ marker). Why did the algorithm fail to find this?
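
As a quick check that the second context really does solve the constraint, the definition of $\beta_1$ can be unfolded in the right-hand side. This is a worked verification only; it says nothing about how the solution is found:

```latex
\begin{align*}
\mathsf{F}\langle\beta_0 * \beta_1\rangle
  &\equiv \mathsf{F}\langle\beta_0 * (\beta_0^{-1} * \gamma)\rangle && \text{definition of } \beta_1 \\
  &\equiv \mathsf{F}\langle(\beta_0 * \beta_0^{-1}) * \gamma\rangle && \text{associativity} \\
  &\equiv \mathsf{F}\langle\gamma\rangle                            && \text{inverses, identity} \\
  &\equiv \alpha                                                    && \text{definition of } \alpha
\end{align*}
```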

The trouble is that, to solve a flex-rigid constraint, the variable need not
be syntactically equal to the type: units need be equal only up to the theory
of abelian groups. The property that equivalent expressions have the same sets
of free variables³ holds for the syntactic theory and some other useful theories
(Rémy, 1992) but does not hold for groups. For example, the equation $\alpha * \alpha^{-1} \equiv 1$
has $\alpha$ free on the left but not the right. Thus variable occurrence does not imply
dependency. The occurs check in the unification algorithm is overly syntactic.

³ This property is sometimes called regularity in the literature, but I avoid this term because
it means too many different things in other contexts.

To solve this, a flex-rigid constraint can be decomposed into a constraint on 
types, with fresh variables in place of units, and additional constraints to make 
the fresh variables equal to the units. A rigid type decomposes into a 'hull', 
or 'type skeleton', that must match exactly, and a collection of constraints in 
the richer equational theory. Similar techniques are used for type inference in 
annotated type systems (Nielson et al., 1999, §5.3.2). 

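In the example, the constraint $\alpha \equiv \mathsf{F}\langle\beta_0 * \beta_1\rangle$ decomposes into two constraints
$\alpha \equiv \mathsf{F}\langle\gamma\rangle : * \wedge \gamma \equiv \beta_0 * \beta_1 : \mathcal{U}$ in the context
$\alpha : * \fatsemi \beta_0 : \mathcal{U}, \beta_1 : \mathcal{U}, \gamma : \mathcal{U}$. Solving
the first constraint gives $\gamma : \mathcal{U}, \alpha := \mathsf{F}\langle\gamma\rangle : * \fatsemi \beta_0 : \mathcal{U}, \beta_1 : \mathcal{U}$, and solving the second
yields the most general solution
$\gamma : \mathcal{U}, \alpha := \mathsf{F}\langle\gamma\rangle : * \fatsemi \beta_0 : \mathcal{U}, \beta_1 := (\beta_0^{-1} * \gamma) : \mathcal{U}$.

Committing only to the hull is the minimal commitment entailed by the equation,
as far as the equational theory on types goes. One could even go further
and solve every flex-rigid equation one constructor layer at a time, so $\alpha \equiv \tau \to \upsilon$
would be solved by $\alpha \equiv \beta_0 \to \beta_1 \wedge \beta_0 \equiv \tau \wedge \beta_1 \equiv \upsilon$.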
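
The decomposition of a rigid type into a hull and unit constraints can be sketched as a traversal that replaces each unit by a fresh variable. The code below is an illustrative assumption, not the implementation from Appendix B.4: the types `Unit` and `Type`, the function `hull`, and the naming scheme for fresh variables are all hypothetical.

```haskell
-- Unit expressions; the representation is illustrative.
data Unit = UVar String | UBase String | UOne | UMul Unit Unit | UInv Unit
  deriving (Eq, Show)

-- Types: metavariables, function types, and numbers indexed by a unit.
data Type = TVar String | Arr Type Type | F Unit
  deriving (Eq, Show)

-- Replace every unit under F by a fresh unit variable. The result is the
-- hull instantiated with fresh variables, the list of equations
-- (fresh variable, original unit) still to be solved by group unification,
-- and the next unused index. The Int argument threads the name supply.
hull :: Int -> Type -> (Type, [(String, Unit)], Int)
hull n (TVar a)  = (TVar a, [], n)
hull n (Arr s t) =
  let (s', cs, n')  = hull n s
      (t', ds, n'') = hull n' t
  in  (Arr s' t', cs ++ ds, n'')
hull n (F nu)    =
  let b = "beta" ++ show n
  in  (F (UVar b), [(b, nu)], n + 1)
```

For a type with two unit-indexed components, `hull 0` returns the same shape with `beta0` and `beta1` in place of the units, together with the two equations relating them to the originals; solving the flex-rigid problem against the first component and then the equations one at a time corresponds to the decomposition described above.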

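The rules from Figure 2.5 (page 21) can be modified to maintain the invariant
that the only unit metavariables a flex-rigid problem depends on (i.e. those in the
rigid type $\tau$ or suffix $\Xi$) are fresh unknowns. Unit metavariables are never made
less local by collecting them in $\Xi$ as dependencies. Type unification does not
prejudice locality of unit metavariables: they must be left for group unification.
The rule

\[ \frac{\tau \text{ non-variable} \quad \Theta_0 \mid \cdot \vdash \alpha \equiv \tau : * \dashv \Theta_1}{\Theta_0 \vdash \alpha \equiv \tau : * \dashv \Theta_1}\ \text{INST} \]

is replaced by

\[ \frac{\tau \text{ non-variable} \quad \vec{\beta_i} \text{ fresh} \quad \Theta_0 \mid \vec{\beta_i : \mathcal{U}} \vdash \alpha \equiv \tau\{\vec{\beta_i}\} : * \dashv \Theta_1 \quad \Theta_1 \vdash \vec{\beta_i \equiv \nu_i : \mathcal{U}} \dashv \Theta_2}{\Theta_0 \vdash \alpha \equiv \tau\{\vec{\nu_i}\} : * \dashv \Theta_2}\ \text{INST} \]

where $\tau\{\vec{\nu_i}\}$ is the hull of the type $\tau$, parameterised by a vector of units (so
$\mathsf{F}\langle\nu_0\rangle \to \mathsf{F}\langle\nu_1\rangle$ has hull $\mathsf{F}\langle\_\rangle \to \mathsf{F}\langle\_\rangle$ and
$\tau\{\alpha_0, \alpha_1\} = \mathsf{F}\langle\alpha_0\rangle \to \mathsf{F}\langle\alpha_1\rangle$). Vectors
of equations are solved one at a time, threading the context:

\[ \frac{\Theta_0 \parallel \cdot \vdash \beta_0 * \nu_0^{-1} \equiv 1 : \mathcal{U} \dashv \Theta_1 \quad \cdots \quad \Theta_{n-1} \parallel \cdot \vdash \beta_{n-1} * \nu_{n-1}^{-1} \equiv 1 : \mathcal{U} \dashv \Theta_n}{\Theta_0 \vdash \beta_0 \equiv \nu_0 : \mathcal{U} \wedge \cdots \wedge \beta_{n-1} \equiv \nu_{n-1} : \mathcal{U} \dashv \Theta_n}\ \text{CONJ} \]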
The updated rules are given in Figures 3.6 and 3.7. Apart from the addition
of the UNIT rule, and the modification to the INST rule, the only changes are
minor generalisations, such as changing rules to work with an arbitrary kind $\kappa$,
rather than just $*$. Again, symmetric variants of the INST and DEFINE rules have
been omitted. The implementation is given in Appendix B.4 (page 212).

Similarly to Definition 2.1 (page 20) in the previous chapter, the instantiation 
part of the algorithm expects a number of conditions to be satisfied: 

Definition 3.1. The quadruple $(\Theta_0, \Xi, \alpha, \tau)$ satisfies the input conditions if

• $\Theta_0 \vdash \alpha : *$ where $\alpha$ is a metavariable,

• $\Theta_0, \Xi \vdash \tau : *$ where $\tau$ is not a metavariable,

• $\Xi$ contains only metavariable declarations $\beta : \kappa$ with $\beta \in \mathrm{fmv}(\tau)$, and

• if $\mathsf{F}\langle\nu\rangle$ is a subterm of $\tau$ then $\nu = \beta$ for some $\beta$ with $\Xi \ni \beta : \mathcal{U}$.

The crucial addition, maintained by the new INST rule, is the last condition.
This is necessary for generality, as it ensures that every unit metavariable in $\Xi$
is a true dependency of $\tau$, and completeness, as it ensures that $\Xi$ captures all
the unit metavariable dependencies of $\tau$, so the algorithm will not encounter an
unexpected unit metavariable dependency and get stuck.

3.2.2 Correctness of type unification 

With the above refinement, type unification gives most general results. 

Lemma 3.4 (Soundness and generality of type unification).

(a) If $\Theta_0 \vdash \tau \equiv \upsilon : * \dashv \Theta_1$, then $\Theta_0 \sqsubseteq \Theta_1$ is a minimal solution of $\tau \equiv \upsilon : *$.

(b) If $\Theta_0 \mid \Xi \vdash \alpha \equiv \tau : * \dashv \Theta_1$, then $\Theta_0, \Xi \sqsubseteq \Theta_1$ is a minimal solution of $\alpha \equiv \tau : *$.

Proof. Proceed by induction on the structure of derivations, as in Lemma 2.6
(page 22). The majority of the cases are similar to the previous proof, but
the UNIT rule is new and the INST rule has been modified. The INST-SKIP-SEMI
rule requires a more subtle generality proof, in order to verify that instantiation
moves only genuine dependencies. The input conditions ensure that units always
occur in the form $\mathsf{F}\langle\alpha\rangle$, so it is obvious that $\alpha$ is a dependency. For details, see
Appendix D.2 (page 240). □
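$\Theta_0 \vdash \tau \equiv \upsilon : * \dashv \Theta_1$   (unifying $\tau$ with $\upsilon$ in $\Theta_0$ results in $\Theta_1$)

\[ \frac{\Theta_0 \vdash \tau_0 \equiv \upsilon_0 : * \dashv \Theta_1 \quad \Theta_1 \vdash \tau_1 \equiv \upsilon_1 : * \dashv \Theta_2}{\Theta_0 \vdash (\tau_0 \to \tau_1) \equiv (\upsilon_0 \to \upsilon_1) : * \dashv \Theta_2}\ \text{DECOMPOSE} \]

\[ \frac{\Theta_0 \parallel \cdot \vdash \nu_0 * \nu_1^{-1} \equiv 1 : \mathcal{U} \dashv \Theta_1}{\Theta_0 \vdash \mathsf{F}\langle\nu_0\rangle \equiv \mathsf{F}\langle\nu_1\rangle : * \dashv \Theta_1}\ \text{UNIT} \]

\[ \frac{\tau \text{ non-variable} \quad \vec{\beta_i} \text{ fresh} \quad \Theta_0 \mid \vec{\beta_i : \mathcal{U}} \vdash \alpha \equiv \tau\{\vec{\beta_i}\} : * \dashv \Theta_1 \quad \Theta_1 \vdash \vec{\beta_i \equiv \nu_i : \mathcal{U}} \dashv \Theta_2}{\Theta_0 \vdash \alpha \equiv \tau\{\vec{\nu_i}\} : * \dashv \Theta_2}\ \text{INST} \]

\[ \frac{}{\Theta, \alpha : * \vdash \alpha \equiv \alpha : * \dashv \Theta, \alpha : *}\ \text{IDLE} \qquad
   \frac{\alpha \neq \beta}{\Theta, \alpha : * \vdash \alpha \equiv \beta : * \dashv \Theta, \alpha := \beta : *}\ \text{DEFINE} \]

\[ \frac{\Theta_0 \vdash [\rho/\gamma]\alpha \equiv [\rho/\gamma]\beta : * \dashv \Theta_1}{\Theta_0, \gamma := \rho : \kappa \vdash \alpha \equiv \beta : * \dashv \Theta_1, \gamma := \rho : \kappa}\ \text{SUBS} \]

\[ \frac{\Theta_0 \vdash \alpha \equiv \beta : * \dashv \Theta_1 \quad \alpha \neq \gamma \quad \beta \neq \gamma}{\Theta_0, \gamma : \kappa \vdash \alpha \equiv \beta : * \dashv \Theta_1, \gamma : \kappa}\ \text{SKIP-TY} \]

\[ \frac{\Theta_0 \vdash \alpha \equiv \beta : * \dashv \Theta_1}{\Theta_0, x : \sigma \vdash \alpha \equiv \beta : * \dashv \Theta_1, x : \sigma}\ \text{SKIP-TM} \qquad
   \frac{\Theta_0 \vdash \alpha \equiv \beta : * \dashv \Theta_1}{\Theta_0 \fatsemi \vdash \alpha \equiv \beta : * \dashv \Theta_1 \fatsemi}\ \text{SKIP-SEMI} \]

\[ \frac{\Theta_0 \parallel \cdot \vdash \beta_0 * \nu_0^{-1} \equiv 1 : \mathcal{U} \dashv \Theta_1 \quad \cdots \quad \Theta_{n-1} \parallel \cdot \vdash \beta_{n-1} * \nu_{n-1}^{-1} \equiv 1 : \mathcal{U} \dashv \Theta_n}{\Theta_0 \vdash \beta_0 \equiv \nu_0 : \mathcal{U} \wedge \cdots \wedge \beta_{n-1} \equiv \nu_{n-1} : \mathcal{U} \dashv \Theta_n}\ \text{CONJ} \]
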
Figure 3.6: Algorithmic rules for type unification (part 1) 
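$\Theta_0 \mid \Xi \vdash \alpha \equiv \tau : * \dashv \Theta_1$   (instantiating $\alpha$ with $\tau$ in $\Theta_0, \Xi$ results in $\Theta_1$)

\[ \frac{\alpha \notin \mathrm{fmv}(\tau)}{\Theta_0, \alpha : * \mid \Xi \vdash \alpha \equiv \tau : * \dashv \Theta_0, \Xi, \alpha := \tau : *}\ \text{INST-DEFINE} \]

\[ \frac{\Theta_0, \Xi \vdash [\rho/\beta]\alpha \equiv [\rho/\beta]\tau : * \dashv \Theta_1}{\Theta_0, \beta := \rho : \kappa \mid \Xi \vdash \alpha \equiv \tau : * \dashv \Theta_1, \beta := \rho : \kappa}\ \text{INST-SUBS} \]

\[ \frac{\Theta_0 \mid \beta : *, \Xi \vdash \alpha \equiv \tau : * \dashv \Theta_1 \quad \alpha \neq \beta \quad \beta \in \mathrm{fmv}(\tau)}{\Theta_0, \beta : * \mid \Xi \vdash \alpha \equiv \tau : * \dashv \Theta_1}\ \text{INST-DEPEND} \]

\[ \frac{\Theta_0 \mid \Xi \vdash \alpha \equiv \tau : * \dashv \Theta_1 \quad \beta \notin \mathrm{fmv}(\tau)}{\Theta_0, \beta : \kappa \mid \Xi \vdash \alpha \equiv \tau : * \dashv \Theta_1, \beta : \kappa}\ \text{INST-SKIP-TY} \]

\[ \frac{\Theta_0 \mid \Xi \vdash \alpha \equiv \tau : * \dashv \Theta_1}{\Theta_0, x : \sigma \mid \Xi \vdash \alpha \equiv \tau : * \dashv \Theta_1, x : \sigma}\ \text{INST-SKIP-TM} \qquad
   \frac{\Theta_0 \mid \Xi \vdash \alpha \equiv \tau : * \dashv \Theta_1}{\Theta_0 \fatsemi \mid \Xi \vdash \alpha \equiv \tau : * \dashv \Theta_1 \fatsemi}\ \text{INST-SKIP-SEMI} \]

Figure 3.7: Algorithmic rules for type unification (part 2)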
Lemma 3.5 (Completeness of type unification).

(a) If the types $\upsilon$ and $\tau$ are well-formed in $\Theta_0$ and there is some $\theta : \Theta_0 \sqsubseteq \Theta'$ with
$\Theta' \vdash \theta\upsilon \equiv \theta\tau : *$, then unification produces $\Theta_1$ such that
$\Theta_0 \vdash \upsilon \equiv \tau : * \dashv \Theta_1$.

(b) Moreover, if $\theta : \Theta_0, \Xi \sqsubseteq \Theta'$ is such that $\Theta' \vdash \theta\alpha \equiv \theta\tau : *$ and the input
conditions (Definition 3.1) are satisfied, then there is some context $\Theta_1$ such
that $\Theta_0 \mid \Xi \vdash \alpha \equiv \tau : * \dashv \Theta_1$.

Proof. Termination of the algorithm can be established via an appropriate or- 
dering. Proceed by structural induction on the call graph, observing that each 
rule preserves solutions, and that all (potentially solvable) cases are covered. 
Completeness of appeals to group unification follows from Lemma 3.3. For more 
details, see Appendix D.2 (page 241). □ 



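3.3 Type inference for units of measure

I have given a unification algorithm for types containing units of measure in
Section 3.2, and this extends to a type inference algorithm for the corresponding
type system. Given the new types, amended unification algorithm and the ability
for type schemes to quantify over variables of kind $\mathcal{U}$, no changes to the type
inference algorithm from Section 2.3 are required.

Generalisation is easy and there is no need to complicate the type inference
algorithm to deal with units of measure. The initial context can be extended
with constant terms that use the new types. Moreover, thanks to the refinement
of Section 3.2.1, the algorithm copes naturally with the problematic term from
Subsection 3.0.1, correctly inferring its most general type. Recall the example:

\[ \lambda x.\ \mathsf{let}\ y = \mathit{div}\ x\ \mathsf{in}\ (y\ \mathit{mass},\ y\ \mathit{time}), \quad \text{where} \]
\[ \mathit{div} : \forall\alpha{:}\mathcal{U}.\ \forall\beta{:}\mathcal{U}.\ \mathsf{F}\langle\alpha * \beta\rangle \to \mathsf{F}\langle\alpha\rangle \to \mathsf{F}\langle\beta\rangle, \quad \mathit{mass} : \mathsf{F}\langle\mathrm{kg}\rangle, \quad \mathit{time} : \mathsf{F}\langle\mathrm{s}\rangle. \]

At the crucial point when the type of $y$ is being inferred, the situation is

\[ \alpha : *,\ x : \alpha \fatsemi \beta_0 : \mathcal{U},\ \beta_1 : \mathcal{U} \vdash \mathit{div}\ x : \mathsf{F}\langle\beta_0\rangle \to \mathsf{F}\langle\beta_1\rangle \quad \text{subject to } \alpha \equiv \mathsf{F}\langle\beta_0 * \beta_1\rangle, \]

where $\alpha$ is an unknown fresh type variable standing in for the type of $x$. The
constraint decomposes into two simpler constraints
$\alpha \equiv \mathsf{F}\langle\gamma\rangle : * \wedge \gamma \equiv \beta_0 * \beta_1 : \mathcal{U}$
with $\gamma$ a fresh unit metavariable. These can be solved one at a time to give the
solution $\gamma : \mathcal{U},\ \alpha := \mathsf{F}\langle\gamma\rangle : *,\ x : \alpha \fatsemi \beta_0 : \mathcal{U},\ \beta_1 := (\gamma * \beta_0^{-1}) : \mathcal{U}$. Generalising by
'skimming off' type variables in the locality gives the type scheme

\[ \gamma : \mathcal{U},\ x : \mathsf{F}\langle\gamma\rangle \vdash y : \forall\beta_0{:}\mathcal{U}.\ \mathsf{F}\langle\beta_0\rangle \to \mathsf{F}\langle\gamma * \beta_0^{-1}\rangle, \]

which is principal. Type inference for the whole term succeeds, giving the type

\[ \mathsf{F}\langle\gamma\rangle \to (\mathsf{F}\langle\gamma * \mathrm{kg}^{-1}\rangle,\ \mathsf{F}\langle\gamma * \mathrm{s}^{-1}\rangle). \]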

3.4 Discussion 

I have shown how to combine abelian group unification with syntactic unification 
while carefully tracking dependencies in a structured context, so generalisation 
is straightforward. Crucially, contexts capture an appropriate notion of locality, 
so a local solution is more general than a global one. The algorithms presented 
here solve unification problems by making gradual steps towards a solution, and 
it is comparatively easy to check that each step is sound and most general. A key
point is that flex-rigid equations $\alpha \equiv \tau$ cannot always be solved by substituting
$\tau$ for $\alpha$, given a nontrivial equational theory. Instead, $\tau$ decomposes into a 'hull'
(the outer structure that $\alpha$ must match exactly) and a collection of constraints
in the equational theory.

This technique can be applied to other equational theories and more advanced 
type systems. The integers are an abelian group under addition, so the work in 
this chapter could be combined with the account of elaboration in Chapter 7 to 
elaborate types indexed by integers. 

In this chapter I have been following the trail that Kennedy blazed, in the 
representation of units of measure using a free abelian group, the observation that 
unification has unique most general unifiers in this case, and the application of 
these properties to type inference. To extend the technique to less convenient type 
systems, I will need to deal with problems that cannot necessarily be solved on the 
first attempt. In the next chapter, I will examine higher-order unification, which 
is useful for elaborating higher-rank and dependent types. 'Pattern unification' 
as introduced by Miller (1992) provides a solid starting point, but here an explicit 
representation of postponed unification problems will be essential, because not 
all higher-order unification problems fall into the fragment that can be solved 
immediately. 
3.4.1 Related work



Many authors have proposed designs for systems of units of measure. I have 
followed Kennedy's design, using integer powers, so units form an abelian group. 
Some authors use rational powers (giving a vector space), including Rittri (1995), 
who discusses the merits of both approaches. Chen et al. (2003) give a useful 
overview of work on units, and describe an alternative approach using static 
analysis. 
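
As a rough illustration of the integer-power design, the following Python sketch represents a unit as a finite map from base-unit names to integer exponents, so that multiplication adds exponents and every unit has an inverse. The names `normalise`, `mul`, `inv` and `ONE` are hypothetical helpers chosen for this sketch; they are not taken from Kennedy's work or from any implementation discussed here.

```python
def normalise(unit):
    """Drop base units with exponent zero, so equal units compare equal."""
    return {base: power for base, power in unit.items() if power != 0}

def mul(u, v):
    """Group operation: multiply two units by adding exponents pointwise."""
    out = dict(u)
    for base, power in v.items():
        out[base] = out.get(base, 0) + power
    return normalise(out)

def inv(u):
    """Group inverse: negate every exponent."""
    return {base: -power for base, power in u.items()}

ONE = {}  # the dimensionless unit is the group identity

metre = {"m": 1}
second = {"s": 1}
velocity = mul(metre, inv(second))  # m s^-1
```

Because exponents are integers, `mul(u, inv(u))` is always `ONE`; with rational exponents the same representation would describe a vector space instead.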

Several impressive implementations of units of measure use advanced type 
system features such as GHC Haskell extensions (Buckwalter, n.d.) and C++ 
templates (Schabel and Watanabe, 2013). However, the difficulty of expressing a 
nontrivial equational theory at the type level means that they are complex, have 
limited inference capabilities and tend to expose the internal implementation in 
unfriendly error messages. Making units a type system extension, as in F#, 
results in a much more user-friendly system. 

Rémy (1992) extends the ML type system with other equational theories for 
which variable occurrence does imply dependency (specifically excluding abelian 
groups). As discussed in the previous chapter, his unification algorithm achieves 
easy generalisation by tracking the 'ranks' at which type variables are introduced. 

Sulzmann et al. (1999) propose a version of the HM(X) framework for repre- 
senting type systems in constraint form, which avoids the generalisation problems 
discussed in this chapter by allowing constraints to be quantified over instead of 
solving them immediately. This is a very useful technique, although it is practi- 
cally desirable to solve unification constraints as soon as possible (in the interests 
of efficiency and good error reporting). 
Chapter 4 

Miller pattern unification 



Higher-order unification is the problem of finding definitions for metavariables in 
order to solve an equation between two λ-calculus terms. It extends first-order 
unification, as discussed in Chapter 2, in that 

• terms have a binding structure, so unifiers must respect variable scope: e.g. 
λx.α ≈ λx.x can only be solved by α := x if the metavariable α may depend 
on the bound variable x; and 

• terms have a nontrivial equational theory, given by the β- and η-rules:¹ for 
example, λx.x ≈ λx.λy.α x y can be solved by α := λz.z since 

λx.λy.(λz.z) x y =β λx.λy.x y =η λx.x. 

Given these complications it is perhaps unsurprising that full higher-order 
unification is undecidable (Huet, 1973). Most general unifiers do not necessarily 
exist and terms may have infinite sets of unifiers, though they can be generated by 
a semidecision procedure (Huet, 1975). Miller (1992) observed that a useful 
subproblem, unification in the pattern fragment, is decidable and has unique most 
general unifiers if they exist at all. Here metavariables must be applied to spines 
of distinct bound variables, so λx.x ≈ λx.λy.α x y is included but λx.α x x ≈ λx.x 
is not; observe that the latter has two incompatible solutions α := λx.λy.x and 
α := λy.λx.x. Equations that look like definitions are definitions: α x₁ … xₙ ≈ t 
can be solved by α := λx₁ … xₙ. t. An application to variables determines a 
metavariable fully, while an application to other terms determines it only in part 
(for example, α (λx.x) ≈ t cannot easily be solved). 
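
The pattern condition on a spine can be checked mechanically. The sketch below is only an illustration under simplifying assumptions: variables are represented as strings, any other argument as a tuple, and the function name `is_pattern_spine` is hypothetical.

```python
def is_pattern_spine(args, bound_vars):
    """Return True if args is a list of distinct bound variables.

    args       -- the arguments a metavariable is applied to
    bound_vars -- the set of variable names bound in the enclosing scope
    """
    seen = set()
    for arg in args:
        # Reject non-variables, unbound names and repeated variables.
        if arg not in bound_vars or arg in seen:
            return False
        seen.add(arg)
    return True

scope = {"x", "y"}
in_fragment = is_pattern_spine(["x", "y"], scope)         # α x y
repeated = is_pattern_spine(["x", "x"], scope)            # α x x
non_variable = is_pattern_spine([("lam", "z", "z")], scope)  # α (λz.z)
```

Only the first spine lies in the pattern fragment; the other two correspond to the problematic examples in the text.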

¹ One can consider β-equality alone, but for the purposes of this chapter I will need both. 



Dependently typed programming languages rely on higher-order unification 
for elaborating source programs, much as Hindley-Milner type inference makes use 
of first-order unification. Languages with a kernel type theory, such as Coq (Coq 
Development Team, 2013) and Epigram (McBride and McKinna, 2004), do not 
need unification in the kernel, but they depend on it to elaborate human-readable 
syntax. Likewise, Agda (Norell, 2007) uses higher-order unification for pattern 
matching and implicit argument synthesis. During the elaboration of a source 
language program, metavariables are inserted to stand for function arguments 
that the user has omitted, and unification problems arise when types do not match 
exactly. Elaboration will be considered in more detail in Chapter 7. Dependent 
types naturally lead to higher-order unification problems, since functions express 
dependency (for example, consider solving for α and β in Πx:α. β x ≈ T). 

Programmers in a dependently typed language need to grasp the capabilities 
of unification if they are to become productive users of the language. Knowing 
what to omit, because the machine can reconstruct it for you, is a crucial aspect 
of writing comprehensible programs. 

Languages with simple pairs or Σ-types (pairs in which the type of the second 
component may depend on the value of the first component) motivate extending 
the pattern fragment to projections. For example, consider α hd x ≈ x where 
postfix hd is first projection. This does not fall in the original pattern fragment 
but has most general solution [(λx.x, β)/α] where β is a fresh variable. 

For many applications, the static pattern fragment is overly restrictive: one 
often has multiple constraints, some of which fall into the fragment and some of 
which do not, but solving one constraint may bring others into pattern form. 
This leads to 'dynamic' pattern unification, where non-pattern constraints may 
be postponed in case they are solvable later. For example, given the constraints 
α x ≈ β and α y y ≈ t, the latter is not in the pattern fragment, but after solving 
the first constraint via α := λx.β the second becomes β y ≈ t. 

Dynamic treatment of constraints is necessary even in first-order problems, 
because there is no fixed positional order of constraint solving that will work in all 
cases. For example, consider the problem (α + β, α) ≈ (3, 0) where α and β are 
natural number metavariables. If an algorithm always unifies the components 
of pairs from left to right, it gets stuck on the constraint α + β ≈ 3. On the 
other hand, after solving α ≈ 0, the first constraint computes to the much easier 
β ≈ 3.² The Coq proof assistant, used as a dependently typed programming 
language, suffers from exactly this problem. 

² Always unifying from right to left is no better: what if we swap the pair's components? 
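
The effect of postponement can be seen in a toy solver for the example above. This is a deliberately simplified sketch, not the algorithm of this chapter: each constraint is a hypothetical pair `(variables, total)` meaning that the listed natural-number metavariables sum to `total`, and a constraint is solved only once a single metavariable in it remains unknown.

```python
def solve(constraints):
    """Solve sum constraints, postponing those that are currently stuck.

    Returns (solution, stuck): the assignments found and the constraints
    that could not be solved because too little was known.
    """
    solution = {}
    pending = list(constraints)
    progress = True
    while pending and progress:
        progress = False
        postponed = []
        for variables, total in pending:
            unknown = [v for v in variables if v not in solution]
            known_sum = sum(solution[v] for v in variables if v in solution)
            if len(unknown) == 1:
                value = total - known_sum
                if value < 0:
                    raise ValueError("no natural number solution")
                solution[unknown[0]] = value
                progress = True
            elif not unknown:
                if known_sum != total:
                    raise ValueError("inconsistent constraints")
                progress = True
            else:
                postponed.append((variables, total))  # try again later
        pending = postponed
    return solution, pending

left_to_right = solve([(["a", "b"], 3), (["a"], 0)])
right_to_left = solve([(["a"], 0), (["a", "b"], 3)])
```

Both orderings of the pair's components yield the same assignment, because the stuck constraint is revisited after the easy one has been solved.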
In this chapter, I present a dynamic pattern unification algorithm for a 
language with full-spectrum dependent types including Σ-types. It includes: 

• the use of heterogeneous equality constraints to maintain typing discipline; 

• a novel notion of 'twin variables' used to simplify problems heterogeneously 
when a variable must be assigned two intensionally distinct types, as in 
(λx.s : Πx:A. B) ≈ (λx.t : Πx:S. T); 

• an extension of the context structure from previous chapters, suitable for 
managing dependency and partial progress on unification problems; and 

• the demonstration of a minimal-commitment unification algorithm that 
makes it easy to deliver most general unifiers, when they exist. 

In Section 4.1, I describe the type theory in which I will work. I give the algo- 
rithm in Section 4.2, with a high-level specification via rewrite rules. Correctness 
properties of the algorithm are proved in Section 4.3, although termination is 
problematic. Finally, some concluding remarks form Section 4.4. A Haskell ref- 
erence implementation of the algorithm is given in Appendix C (page 214). 

4.0.1 Related work 

Since Huet's seminal work on higher-order unification for simply typed λ-calculus 
(Huet, 1975), many people have sought to extend it to dependently typed calculi, 
in particular the Edinburgh Logical Framework (Harper et al., 1993), also known 
as λΠ-calculus. Elliott (1990) and Pym (1992) both demonstrated semidecision 
procedures for unification based on Huet's, using the fact that dependencies are 
erasable in the LF to give notions of 'type similarity' (in Pym's terminology) that 
relate the types of terms being unified. Brown (1996) studied the metatheory of a 
variant of λΠ-calculus with type similarity, and used this to re-present unification 
as a system of reduction rules. 

In contrast to Huet-style semidecision procedures, which generate a sequence 
of unifiers, Miller's pattern unification (Miller, 1992) finds most general unifiers 
when they exist, but applies only to a fragment. Duggan (1998) generalised the 
pattern condition to support System Fω with simple product types. Reed (2009a) 
described how to apply dynamic pattern unification to LF. He introduced 'typing 
modulo' (discussed in Subsection 4.0.3) as a neat simplification of type similarity 
and similar invariants used to handle the complications of type dependency. Abel 
and Pientka (2011) extended Reed's algorithm to support λΠΣ-calculus (LF with 
Σ-types) and implemented it for the Beluga language. 
Separately, higher-order unification algorithms have been developed for 
languages based on Martin-Löf Type Theory, such as Agda, or the Calculus of 
Constructions, such as Coq. Here, unlike in LF, full-spectrum dependency means 
that types may be recursively defined and computed from terms by large elim- 
ination. Thus term dependencies in types are not erasable to produce simple 
non-dependent types, and the work on unification for LF is not immediately ap- 
plicable. Pfenning (1991b) extended pattern unification to the Calculus of Con- 
structions, characterising exactly those terms that fall in the pattern fragment 
statically; hence the types can always be unified first. 

This chapter builds on the work of Reed (2009a) and Abel and Pientka (2011) 
to describe unification for a full-spectrum dependent type theory, rather than LF. 

4.0.2 Intensional vs. extensional equality 

Definitional equality in an intensional type theory is the βδη-convertibility 
relation, written s ≡ t. For a strongly normalising theory, it is easy to test in 
a type-directed fashion, by checking that s and t have the same normal form 
(up to α-equivalence) after computation (β-reduction), expansion of definitions 
(δ-expansion) and η-expansion. It is intensional in the sense that extensionally 
equal terms need not be definitionally equal: for example, s = λx.tt and 
t = λx. if x then x else tt are equal on all boolean inputs, but s ≢ t. 
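
The difference can be observed concretely. In the Python sketch below, terms are encoded as nested tuples (an encoding assumed purely for illustration, as is the `evaluate` function): the two terms agree on every boolean input, yet their syntax trees, both already in normal form, are distinct.

```python
# s = λx.tt   and   t = λx. if x then x else tt
s = ("lam", "x", "tt")
t = ("lam", "x", ("if", "x", "x", "tt"))

def evaluate(term, env):
    """Interpret a term: booleans become Python bools, lambdas closures."""
    if term == "tt":
        return True
    if term == "ff":
        return False
    if isinstance(term, str):          # a variable
        return env[term]
    tag = term[0]
    if tag == "lam":
        _, name, body = term
        return lambda value: evaluate(body, {**env, name: value})
    if tag == "if":
        _, cond, then_branch, else_branch = term
        chosen = then_branch if evaluate(cond, env) else else_branch
        return evaluate(chosen, env)
    raise ValueError("unknown term")

# Extensionally equal: the same result on every boolean input.
agree_everywhere = all(
    evaluate(s, {})(b) == evaluate(t, {})(b) for b in (True, False)
)
# Intensionally different: the normal forms are not syntactically equal.
same_syntax = s == t
```

Checking agreement on all inputs is feasible here only because the domain is finite; the Turing machine example that follows in the text shows why this does not scale to a decision procedure.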

Extensional type theories typically add a propositional equality type Id_T s t 
of proofs that s and t are equal, together with the equality reflection rule 

Γ ⊢ u : Id_T s t 
------------------ 
Γ ⊢ s ≡ t : T 

that embeds arbitrary proofs into the definitional equality. Extensional equality 
is undecidable in general: given a description of a Turing machine M, consider 
the function that maps a natural number n to the boolean indicating whether 
M halts within n steps. One cannot hope to decide whether this function is 
extensionally equal to the constantly false function! 

The unification algorithm I will describe finds solutions up to the intensional 
definitional equality, not extensional equality. Finding solutions up to extensional 
equality involves proof search and most general solutions are not (intensionally) 
unique. For example, if α : 𝔹 → 𝔹 is a metavariable and x : 𝔹 is a variable, 
the problem α x ≈ tt has unique solution λx.tt up to definitional equality, but 
solutions up to extensional equality include λx. if x then x else tt and other terms. 
Most type theories have some internal notion of propositional equality in which 
equations can be proved, such as the identity type in Martin-Löf Type Theory 
(Martin-Löf, 1984), which reflects the definitional equality as a type, or coercion 
types in System F_C (Sulzmann et al., 2007), where equality evidence is explicit 
but in a different syntactic category to terms. Given a type theory with a suffi- 
ciently expressive propositional equality, one could represent unification problems 
as types, and unification could deliver terms (equality proofs) as evidence. How- 
ever, in this chapter I prefer to make fewer assumptions about the object type 
theory, emphasising that the work is more widely applicable. 

4.0.3 Heterogeneous equality 

Given the problem Πx:A. B ≈ Πx:S. T, a reasonable step to take is to simplify 
it to A ≈ S, B ≈ T. However, at this stage B and T expect different types for x, 
as the equation between A and S may not be solved immediately. This shows the 
need for a heterogeneous notion of equality, in an intensional setting: it permits 
the expression of equations where the two sides belong to provably (extensionally) 
equal but not definitionally (intensionally) equal types. Such equations would be 
homogeneous in an extensional setting. In general, unification must formulate 
and solve equations between vectors of terms in a telescope, where unifying the 
first n − 1 terms will make the types of the nth terms equal. The unification 
algorithm will maintain the heterogeneity invariant, that every heterogeneous 
equation involves types whose equality is implied by preceding equations; thus 
solutions will always be homogeneous. 

Reed (2009a) elegantly dealt with heterogeneity using a weaker invariant on 
homogeneous equations, typing modulo, which requires that the two sides be well 
typed up to the equational theory of the constraints yet to be solved. However, 
this means that if there are unsolved constraints left when the algorithm ter- 
minates, then some solved metavariables may be ill typed, up to the definitional 
equality. This is problematic for elaboration of a full-spectrum dependently typed 
source language, where typechecking is interleaved with unification, so unification 
must not create ill-typed terms. Norell (2007, Ch. 3) shows how ill-typed solutions 
to metavariables can lead to non-normalising terms and hence non-terminating 
elaboration. The algorithm I present avoids this difficulty by ensuring that all 
outputs are well typed, provided it is given well typed input. 
Variables            x, y, z, X, Y, Z 
Metavariables        α, β 
Terms                s, t, S, T  ::=  n | λx.t | c | Πx:S. T | Σx:S. T | (s, t) 
Constructors         c           ::=  Set | Type | 𝔹 | tt | ff 
Heads                h           ::=  α | x | x́ | x̀ 
Evaluation contexts  e           ::=  • | e t | e hd | e tl | if_(x.T) e s t 
Neutral terms        n           ::=  h · e 
Metacontexts         Θ           ::=  · | Θ, α : T | Θ, α := t : T | Θ, ?P 
Contexts             Γ, Δ        ::=  · | Γ, x : T | Γ, x : S‡T 
Substitutions        δ           ::=  · | δ, t/x | δ, (s, t)/x 
Metasubstitutions    θ, ζ        ::=  · | θ, t/α 
Problems             P, Q        ::=  ⊤ | ⊥ | P ∧ Q | (s : S) ≈ (t : T) | ∀x:S. P | ∀x:S‡T. P 

Figure 4.1: Syntax 

4.1 Back to basics 

The type theory for which I will describe pattern unification essentially consists 
of Martin-Löf Type Theory with Π- and Σ-types, a type of booleans 𝔹 and one 
small universe Set. The only form of dependency is a type-level if-expression, 
allowing large elimination. It is based on Kipling, a theory described by McBride 
(2010a) with a model construction in the dependently typed language Agda. 

In this section, I introduce the representations of terms and contexts, give 
the typing rules, discuss the use of 'twins' for representing variables with two 
provably equal types, explain the role of substitutions, and recall some standard 
metatheoretic properties. These concepts will be used in Section 4.2, where I 
specify the unification algorithm. 

4.1.1 Term representation 

The syntax of terms is given in Figure 4.1. Types and terms live in a single 
syntactic category, though I will typically write s, t, u or v for terms and S, T, 
U or V for types. A neutral (stuck) term n is represented as h · e where h is a 
head and e is an evaluation context, generalising the spine form of Cervesato 
and Pfenning (2003). This allows easy access to the head, which may be a 
variable x, y, z or a metavariable α, β. The accents on variables will be used 
to deal with heterogeneity, as discussed in Subsection 4.1.4. Evaluation contexts 
include applications, if-expressions and projections from Σ-types (written postfix 
hd for first projection and tl for second projection). Embedding neutral terms 
s · e ⇓ t    (redex s · e reduces to normal form t) 

----------- 
t · • ⇓ t 

s · e ⇓ λx.u    [t/x]u ⇓ v 
---------------------------- 
s · (e t) ⇓ v 

s · e ⇓ (t₀, t₁) 
------------------ 
s · (e hd) ⇓ t₀ 

s · e ⇓ (t₀, t₁) 
------------------ 
s · (e tl) ⇓ t₁ 

s · e ⇓ tt 
------------------------------- 
s · (if_(x.T) e t₀ t₁) ⇓ t₀ 

s · e ⇓ ff 
------------------------------- 
s · (if_(x.T) e t₀ t₁) ⇓ t₁ 

s · e ⇓ n 
------------------------- 
s · (e • e′) ⇓ n · e′ 


δ t ⇓ t′    (applying substitution δ to normal form t reduces to t′) 

δ(h) ↦ s    δ e ⇓ e′    s · e′ ⇓ t 
------------------------------------- 
δ(h · e) ⇓ t 

--------- 
δ c ⇓ c 

δ S ⇓ S′    δ T ⇓ T′ 
-------------------------- 
δ(Πy:S. T) ⇓ Πy:S′. T′ 

δ S ⇓ S′    δ T ⇓ T′ 
-------------------------- 
δ(Σy:S. T) ⇓ Σy:S′. T′ 

δ t ⇓ t′ 
------------------ 
δ(λy.t) ⇓ λy.t′ 

δ t ⇓ t′    δ u ⇓ u′ 
----------------------- 
δ(t, u) ⇓ (t′, u′) 


δ e ⇓ e′    (applying substitution δ to evaluation context e reduces to e′) 

--------- 
δ • ⇓ • 

δ e ⇓ e′    δ t ⇓ t′ 
---------------------- 
δ(e t) ⇓ e′ t′ 

δ e ⇓ e′ 
------------------- 
δ(e hd) ⇓ e′ hd 

δ e ⇓ e′ 
------------------- 
δ(e tl) ⇓ e′ tl 

δ e ⇓ e′    δ T ⇓ T′    δ t ⇓ t′    δ u ⇓ u′ 
----------------------------------------------- 
δ(if_(y.T) e t u) ⇓ if_(y.T′) e′ t′ u′ 


δ(x)  ↦  t    where t/x ∈ δ 
δ(x́)  ↦  s    where (s, t)/x ∈ δ 
δ(x̀)  ↦  t    where (s, t)/x ∈ δ 
δ(α)  ↦  α 
θ(x)  ↦  x 
θ(α)  ↦  t    where t/α ∈ θ 

Figure 4.2: Hereditary substitution 
into normal forms is written n̲, though the underline will sometimes be omitted. 
Evaluation contexts can be composed in the obvious way, written e • e′. 

In this representation, terms t are always β-normal but not necessarily η-long. 
This is possible thanks to hereditary substitution (Watkins et al., 2003), defined 
in Figure 4.2. Here s · e ⇓ t means s · e reduces to t, while δ t ⇓ t′ or δ e ⇓ e′ means 
applying the substitution δ to the term t or evaluation context e results in the 
normal form t′ or e′ respectively. These reduction relations are all functional, and 
are decidable for well-typed inputs. Moreover, projecting from any term (even 
if it is ill typed), or applying any term to a variable, will terminate; I make use 
of this in the definitional equality rules for functions and pairs. In the rules, 
I sometimes write redexes in terms, where formally there should be additional 
premises referring to the reduction relations. 
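
The idea of hereditary substitution can be illustrated on a cut-down language with only λ-abstractions and neutral terms in spine form. The sketch below is not the definition of Figure 4.2: it uses de Bruijn indices rather than named variables, omits pairs, booleans and metavariables, and the names `Lam`, `Neu`, `shift`, `subst` and `apply_spine` are hypothetical. As in the text, termination is only expected for well-typed inputs.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Lam:
    body: object          # term under one binder

@dataclass(frozen=True)
class Neu:
    head: int             # de Bruijn index of the head variable
    spine: tuple = ()     # arguments, each a normal form

def shift(t, d, cutoff=0):
    """Add d to every free index (those at or above cutoff) in t."""
    if isinstance(t, Lam):
        return Lam(shift(t.body, d, cutoff + 1))
    head = t.head + d if t.head >= cutoff else t.head
    return Neu(head, tuple(shift(a, d, cutoff) for a in t.spine))

def subst(t, j, s):
    """Substitute s for index j in normal form t, returning a normal form.

    Index j is removed, so indices above j are decremented.
    """
    if isinstance(t, Lam):
        return Lam(subst(t.body, j + 1, shift(s, 1)))
    args = tuple(subst(a, j, s) for a in t.spine)
    if t.head == j:
        # The head is being replaced: reduce the resulting redexes at once.
        return apply_spine(s, args)
    head = t.head - 1 if t.head > j else t.head
    return Neu(head, args)

def apply_spine(f, args):
    """Apply normal form f to each argument in turn, staying beta-normal."""
    for a in args:
        if isinstance(f, Lam):
            f = subst(f.body, 0, a)
        else:
            f = Neu(f.head, f.spine + (a,))
    return f

identity = Lam(Neu(0))                              # λz.z
# λx.λy.α x y, with the metavariable α modelled as free index 0 outside.
term = Lam(Lam(Neu(2, (Neu(1), Neu(0)))))
result = subst(term, 0, identity)                   # λx.λy.x y
```

Substituting λz.z for the head yields λx.λy.x y directly, without ever constructing the intermediate redex (λz.z) x y.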

A telescope Δ = (xᵢ : Tᵢ)ᵢ is a vector of name bindings with corresponding 
types, where each type may depend on the variables x₀, …, xᵢ₋₁. The single 
binding notation Πx:S. T or λx.t generalises in the obvious way to bind a 
telescope ΠΔ. T or λΔ.t. Similarly h Δ is the application of the head h to the 
variables bound in Δ. The non-dependent Π and Σ, where x does not occur in 
the codomain T, are written S → T and S × T respectively. 

4.1.2 Contexts and unification problems 

In the style of contextual type theory (Nanevski et al., 2008), I separate the 
metacontext Θ, which contains metavariables and unification problems, from the 
context Γ, which binds variables. In terms of mixed quantifier prefixes, this 
amounts to maintaining an ∃∀-prefix, a normalised representation of contexts 
in which the existential quantifiers (metavariables) appear before the universal 
quantifiers (variables). This avoids the need for Miller's explicit 'raising' step. 

Unlike contextual type theory, however, I do not represent metavariable 
contexts explicitly: metavariables simply have Π-types. This identification of the 
object language function space with parametrisation in the metalanguage is con- 
venient, when the object type theory is sufficiently expressive, but is not essential. 
In Chapter 7, where the type language lacks first-class higher-order functions, I 
will make use of parametrised metavariables instead. 

A context Γ is a telescope that may also include a novel form of binding, to deal 
with heterogeneous hypotheses for unification problems (see Subsection 4.1.4). 
The set of variables bound by a context is written vars(Γ). 

A metacontext Θ is a list of metavariables α, each carrying a type and possibly 
a definition, and unification problems P. Scope is managed according to the 
invariant that each entry depends only on those that precede it, and in terms, 
metavariables are explicitly applied to all the variables they may depend upon. 

Unification problems include heterogeneous equations (s : S) ≈ (t : T), 
universally quantified variables, truth, falsehood and conjunctions. For brevity, I 
will sometimes omit the types in equations, writing s ≈ t. In Subsection 4.0.3 I 
remarked on the need for the terms being unified to have different types. 

For example, 

α : 𝔹 → 𝔹, β : 𝔹, ?∀x : 𝔹 → 𝔹. (α (x β) : 𝔹) ≈ (x β : 𝔹) 

is a valid metacontext, which declares metavariables α and β and has a single 
unification problem with parameter x. 
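
The scoping invariant on metacontexts is easy to state operationally. In the sketch below, which abstracts away from types and terms entirely, each entry is a hypothetical triple `(kind, name, mentions)`, where `mentions` is the set of metavariables occurring in the entry's type, definition or problem; `well_scoped` checks only dependency order and freshness, not typing.

```python
def well_scoped(metacontext):
    """Check that each entry mentions only metavariables declared before it."""
    declared = set()
    for kind, name, mentions in metacontext:
        if not set(mentions) <= declared:
            return False              # forward or undeclared reference
        if kind == "meta":
            if name in declared:
                return False          # metavariables must be fresh
            declared.add(name)
    return True

# The example metacontext from the text: α, then β, then a problem using both.
example = [
    ("meta", "alpha", set()),
    ("meta", "beta", set()),
    ("problem", None, {"alpha", "beta"}),
]
# Moving the problem to the front breaks the invariant.
reordered = [example[2], example[0], example[1]]
```

Keeping entries in dependency order is what makes a local solution (one that touches only a suffix of the metacontext) meaningful.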

A substitution δ or metasubstitution θ contains terms with which to replace 
variables or metavariables from a context or metacontext. The identity 
(meta)substitution is written ι, and a substitution written as a finite map (such 
as [s/x]) implicitly acts as the identity on all other variables. I will sometimes 
write t{s} instead of [s/x]t where the choice of free variable x is obvious, or to 
represent a term that includes s as a subterm. Typing rules for (meta)substitutions 
are given in Subsection 4.1.5. 

4.1.3 Typing rules 

The typing rules are given in the following figures. They define judgments for 
well-formed metacontexts, contexts and problems; for definitionally equal 
βδ-normal terms; and for true propositions of the unification logic. In the usual 
bidirectional style (Pierce and Turner, 2000), the definitional equality judgment 
is split in two: there is one judgment for normal terms, where a type is given as 
input, and one for neutral terms, where a type is produced as output. In the in- 
terest of brevity, definitional equality is treated as a partial equivalence relation, 
with the typing judgment being the diagonal of equality. Crucially, the 
definitional equality, and hence typechecking, is decidable³ using standard techniques 
(Coquand, 1996; Chapman et al., 2005; Löh et al., 2010). Appendix C.3 (page 
221) gives a Haskell implementation of the typechecking algorithm. 

In particular, the theory is well-founded because there is only one universe 
Set : Type, and Type itself is not a well-typed term. (Formally, T : Type should 
be considered a separate judgment to t: T.) 

³ Strictly speaking, it is not possible to decide well-formedness because it depends on the 
truth of propositions in the unification logic, which is not decidable. In practice, this does not 
matter, because I can always assume that the algorithms are given well-formed inputs. 
The judgments Θ ⊢ mctx, Θ | Γ ⊢ ctx and Θ | Γ ⊢ P wf, defined in Figure 4.3, 
mean respectively that Θ, Γ and P are well-formed. Contexts and 
metacontexts must bind distinct variables, where x # Γ means x is fresh for Γ and 
α # Θ means α is fresh for Θ. Note that well-formed problems are required 
to satisfy the heterogeneity invariant: for (s : S) ≈ (t : T) to be well-formed, 
(S : Type) ≈ (T : Type) must be true in the unification logic. 

The judgment Θ | Γ ⊢ T ∋ s ≡[ u ]≡ t, defined in Figure 4.4, means that 
s and t are definitionally equal terms checked at type T, with η-long standard 
form u (regarded as an output). This ternary presentation of equality is novel, to 
my knowledge. It is often useful to pick a canonical representative when working 
up to an equivalence; for example, it makes the admissibility of symmetry easy 
to prove. As in the work on Kipling by McBride (2010a), this judgment really 
expresses the fact that s and t are equivalent syntactic presentations of u. 

The definitional equality includes type-directed rules that compare functions 
by applying them to a fresh variable, and compare pairs by computing their 
projections, thereby covering both the η-laws and congruence for functions and 
pairs. A type is atomic if it is not a Π- or Σ-type: this is used in the change of 
direction rule to ensure that the equality judgment is syntax-directed, as otherwise 
it would overlap with the rules for functions and pairs. 

The judgment Θ | Γ ⊢ h · e ≡[ h″ · e″ ]≡ h′ · e′ ∈ T, defined in Figure 4.5, 
means that the neutral terms h · e and h′ · e′ are definitionally equal with inferred 
type T and standard form h″ · e″. Note that there is no rule for inferring the 
type of a defined metavariable α := t : T; rather, definitions must be immediately 
substituted out, which simplifies the presentation of the algorithm. 

The judgment Θ | Γ ⊢ P, defined in Figures 4.6 and 4.7, means that P is true, 
i.e. it follows from hypotheses in the metacontext. This defines a unification logic 
in the sense of Pfenning (1991a), where the separation of the metacontext from 
the context amounts to keeping all existential quantifiers outermost. In terms of 
the analysis by Martin-Löf (1996), this judgment says that P 'is true', whereas 
the judgment Θ | Γ ⊢ P wf says that P 'is a proposition'. 

I will sometimes omit the standard form, writing Θ | Γ ⊢ T ∋ s ≡ t instead 
of Θ | Γ ⊢ T ∋ s ≡[ u ]≡ t. The typing judgment Θ | Γ ⊢ T ∋ t is defined as 
Θ | Γ ⊢ T ∋ t ≡ t, meaning the equivalence relation is reflexive on well-typed 
terms by definition. Similarly, I will sometimes write Θ | Γ ⊢ h · e ∈ T instead of 
Θ | Γ ⊢ h · e ≡[ h′ · e′ ]≡ h · e ∈ T. 
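Θ ⊢ mctx    (Θ is a valid metacontext) 

---------- 
· ⊢ mctx 

Θ | · ⊢ P wf 
--------------- 
Θ, ?P ⊢ mctx 

α # Θ    Θ | · ⊢ Type ∋ T 
---------------------------- 
Θ, α : T ⊢ mctx 

α # Θ    Θ | · ⊢ T ∋ t 
------------------------- 
Θ, α := t : T ⊢ mctx 


Θ | Γ ⊢ ctx    (Γ is a valid context in metacontext Θ) 

Θ ⊢ mctx 
------------- 
Θ | · ⊢ ctx 

x # Γ    Θ | Γ ⊢ Type ∋ T 
---------------------------- 
Θ | Γ, x : T ⊢ ctx 

x # Γ    Θ | Γ ⊢ (S : Type) ≈ (T : Type) 
------------------------------------------- 
Θ | Γ, x : S‡T ⊢ ctx 


Θ | Γ ⊢ P wf    (P is a well-formed problem in Θ and Γ) 

Θ | Γ ⊢ ctx 
---------------- 
Θ | Γ ⊢ ⊤ wf 

Θ | Γ ⊢ ctx 
---------------- 
Θ | Γ ⊢ ⊥ wf 

Θ | Γ ⊢ ctx 
------------------------------------------ 
Θ | Γ ⊢ (Set : Type) ≈ (Set : Type) wf 

Θ | Γ, x : S ⊢ P wf 
----------------------- 
Θ | Γ ⊢ ∀x:S. P wf 

Θ | Γ ⊢ P wf    Θ, ?∀Γ. P | Γ ⊢ Q wf 
--------------------------------------- 
Θ | Γ ⊢ P ∧ Q wf 

Θ | Γ ⊢ S ∋ s    Θ | Γ ⊢ T ∋ t    Θ | Γ ⊢ (S : Type) ≈ (T : Type) 
-------------------------------------------------------------------- 
Θ | Γ ⊢ (s : S) ≈ (t : T) wf 

Θ | Γ, x : S‡T ⊢ P wf 
------------------------- 
Θ | Γ ⊢ ∀x:S‡T. P wf 
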
Figure 4.3: Well-formed contexts 
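Θ | Γ ⊢ T ∋ s ≡[ u ]≡ t    (type T accepts s equal to t with standard form u) 

Θ | Γ ⊢ ctx 
----------------------------------- 
Θ | Γ ⊢ Type ∋ Set ≡[ Set ]≡ Set 

Θ | Γ ⊢ Set ∋ S ≡[ U ]≡ T 
----------------------------- 
Θ | Γ ⊢ Type ∋ S ≡[ U ]≡ T 

Θ | Γ ⊢ Set ∋ S₀ ≡[ U ]≡ S₁    Θ | Γ, x : U ⊢ Set ∋ T₀ ≡[ V ]≡ T₁ 
--------------------------------------------------------------------- 
Θ | Γ ⊢ Set ∋ Πx:S₀. T₀ ≡[ Πx:U. V ]≡ Πx:S₁. T₁ 

s x ⇓ s′    t x ⇓ t′    Θ | Γ, x : U ⊢ V ∋ s′ ≡[ u ]≡ t′ 
----------------------------------------------------------- 
Θ | Γ ⊢ Πx:U. V ∋ s ≡[ λx.u ]≡ t 

Θ | Γ ⊢ Set ∋ S₀ ≡[ U ]≡ S₁    Θ | Γ, x : U ⊢ Set ∋ T₀ ≡[ V ]≡ T₁ 
--------------------------------------------------------------------- 
Θ | Γ ⊢ Set ∋ Σx:S₀. T₀ ≡[ Σx:U. V ]≡ Σx:S₁. T₁ 

s hd ⇓ s₀    s tl ⇓ s₁    t hd ⇓ t₀    t tl ⇓ t₁ 
Θ | Γ ⊢ U ∋ s₀ ≡[ u₀ ]≡ t₀    Θ | Γ ⊢ V{u₀} ∋ s₁ ≡[ u₁ ]≡ t₁ 
---------------------------------------------------------------- 
Θ | Γ ⊢ Σx:U. V ∋ s ≡[ (u₀, u₁) ]≡ t 

Θ | Γ ⊢ ctx 
------------------------------ 
Θ | Γ ⊢ Set ∋ 𝔹 ≡[ 𝔹 ]≡ 𝔹 

Θ | Γ ⊢ ctx 
------------------------------- 
Θ | Γ ⊢ 𝔹 ∋ tt ≡[ tt ]≡ tt 

Θ | Γ ⊢ ctx 
------------------------------- 
Θ | Γ ⊢ 𝔹 ∋ ff ≡[ ff ]≡ ff 

Θ | Γ ⊢ h · e ≡[ h″ · e″ ]≡ h′ · e′ ∈ T 
Θ | Γ ⊢ Type ∋ S ≡[ U ]≡ T    S atomic 
------------------------------------------- 
Θ | Γ ⊢ S ∋ h · e ≡[ h″ · e″ ]≡ h′ · e′ 
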
Figure 4.4: Definitional equality: normal terms 
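Θ | Γ ⊢ h · e ≡[ h″ · e″ ]≡ h′ · e′ ∈ T    (h · e equals h′ · e′ with inferred type T) 

Θ ∋ α : T    Θ | Γ ⊢ ctx 
------------------------------------- 
Θ | Γ ⊢ α · • ≡[ α · • ]≡ α · • ∈ T 

Γ ∋ x : T    Θ | Γ ⊢ ctx 
------------------------------------- 
Θ | Γ ⊢ x · • ≡[ x · • ]≡ x · • ∈ T 

Γ ∋ x : S‡T    Θ | Γ ⊢ ctx 
------------------------------------- 
Θ | Γ ⊢ x́ · • ≡[ x́ · • ]≡ x́ · • ∈ S 

Γ ∋ x : S‡T    Θ | Γ ⊢ ctx 
------------------------------------- 
Θ | Γ ⊢ x̀ · • ≡[ x̀ · • ]≡ x̀ · • ∈ T 

Θ | Γ ⊢ h · e ≡[ h″ · e″ ]≡ h′ · e′ ∈ Πx:U. V    Θ | Γ ⊢ U ∋ u ≡[ u″ ]≡ u′ 
------------------------------------------------------------------------------ 
Θ | Γ ⊢ h · e u ≡[ h″ · e″ u″ ]≡ h′ · e′ u′ ∈ V{u″} 

Θ | Γ ⊢ h · e ≡[ h″ · e″ ]≡ h′ · e′ ∈ Σx:U. V 
------------------------------------------------- 
Θ | Γ ⊢ h · e hd ≡[ h″ · e″ hd ]≡ h′ · e′ hd ∈ U 

Θ | Γ ⊢ h · e ≡[ h″ · e″ ]≡ h′ · e′ ∈ Σx:U. V 
--------------------------------------------------------------- 
Θ | Γ ⊢ h · e tl ≡[ h″ · e″ tl ]≡ h′ · e′ tl ∈ V{h″ · e″ hd} 

Θ | Γ ⊢ h · e ≡[ h″ · e″ ]≡ h′ · e′ ∈ 𝔹    Θ | Γ, x : 𝔹 ⊢ Type ∋ S ≡[ U ]≡ T 
Θ | Γ ⊢ U{tt} ∋ u ≡[ u″ ]≡ u′    Θ | Γ ⊢ U{ff} ∋ v ≡[ v″ ]≡ v′ 
------------------------------------------------------------------------------------------ 
Θ | Γ ⊢ h · if_(x.S) e u v ≡[ h″ · if_(x.U) e″ u″ v″ ]≡ h′ · if_(x.T) e′ u′ v′ ∈ U{h″ · e″} 
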
Figure 4.5: Definitional equality: neutral terms 






Θ | Γ ⊢ P    (P is true in the unification logic) 

[Inference rules for the unification logic.] 

Figure 4.6: Unification logic 
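Θ | Γ ⊢ P 

Θ | Γ ⊢ (S : Set) ≈ (U : Set)    Θ | Γ, x : S‡U ⊢ (T{x́} : Set) ≈ (V{x̀} : Set) 
--------------------------------------------------------------------------------- 
Θ | Γ ⊢ (Πx:S. T : Set) ≈ (Πx:U. V : Set) 

Θ | Γ ⊢ (S : Set) ≈ (U : Set)    Θ | Γ, x : S‡U ⊢ (T{x́} : Set) ≈ (V{x̀} : Set) 
--------------------------------------------------------------------------------- 
Θ | Γ ⊢ (Σx:S. T : Set) ≈ (Σx:U. V : Set) 

Θ | Γ, x : S‡U ⊢ (s x́ : T{x́}) ≈ (t x̀ : V{x̀}) 
-------------------------------------------------- 
Θ | Γ ⊢ (s : Πx:S. T) ≈ (t : Πx:U. V) 

Θ | Γ ⊢ (s hd : S) ≈ (t hd : U)    Θ | Γ ⊢ (s tl : T{s hd}) ≈ (t tl : V{t hd}) 
---------------------------------------------------------------------------------- 
Θ | Γ ⊢ (s : Σx:S. T) ≈ (t : Σx:U. V) 

Θ | Γ ⊢ (n : Πx:S. T) ≈ (n′ : Πx:U. V)    Θ | Γ ⊢ (s : S) ≈ (t : U) 
----------------------------------------------------------------------- 
Θ | Γ ⊢ (n s : T{s}) ≈ (n′ t : V{t}) 

Θ | Γ ⊢ (n : Σx:S. T) ≈ (n′ : Σx:U. V) 
------------------------------------------ 
Θ | Γ ⊢ (n hd : S) ≈ (n′ hd : U) 

Θ | Γ ⊢ (n : Σx:S. T) ≈ (n′ : Σx:U. V) 
---------------------------------------------- 
Θ | Γ ⊢ (n tl : T{n hd}) ≈ (n′ tl : V{n′ hd}) 

Θ | Γ, x : 𝔹 ⊢ (T : Type) ≈ (T′ : Type)    Θ | Γ ⊢ (n : 𝔹) ≈ (n′ : 𝔹) 
Θ | Γ ⊢ (s : T{tt}) ≈ (s′ : T′{tt})    Θ | Γ ⊢ (t : T{ff}) ≈ (t′ : T′{ff}) 
------------------------------------------------------------------------------ 
Θ | Γ ⊢ (if_(x.T) n s t : T{n}) ≈ (if_(x.T′) n′ s′ t′ : T′{n′}) 
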
Figure 4.7: Unification logic: congruence rules 
4.1.4 Twins 



Unification will require the incremental simplification of unification problems. In 
a heterogeneous setting, an immediate question is how to simplify the problem 

(s : Πx:S. T) ≈ (t : Πx:U. V), 

since it would not be type-correct (absent typing modulo) to produce 

∀x:S. (s x : T) ≈ (t x : V). 

We need x: S on the left and x: U on the right, and we need to know that they 
are the same x. This motivates the introduction of twin variables, allowing the 
problem to be simplified to 

∀x:S‡U. (s x́ : T{x́}) ≈ (t x̀ : V{x̀}) 

where x́ and x̀ represent the same variable at two different types, bound by 
x : S‡U. The heterogeneity invariant (Subsection 4.0.3) means that S and U will 
be constrained to be equal by problems in the metacontext, but have not yet 
necessarily been unified. If the types become definitionally equal, the twins can 
be replaced with a single variable. On the other hand, the fact that they are 
different might not prevent the problem from being solved (if at least one of s 
and t is a constant function, for example). 

Twins bind a single name, but occurrences of the variable mark which twin 
they refer to. Thus they can be distinguished when typechecking, and substitution 
must replace them with a pair of terms that are provably equal. Of course, twins 
are bound as parameters of unification problems, not in terms, so β-reduction 
never substitutes for twins. I write x ~ y if x and y are identical or twins. 

If unification problems were represented as types, twins could be distinct 
variables with a proof of their (propositional) equality; replacing them with a 
single variable would exploit the elimination principle for propositional equality. 

Definitional equality is tested in the algorithm when typechecking a candidate 
solution for a metavariable, but it treats twins as distinct, so the presence of 
twins may prevent a metavariable being instantiated with a purported solution, 
as indeed it should. When calculating the free variables of a term, the twin 
annotations are ignored, so fv(x́) = fv(x̀) = fv(x) = {x}. 
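
The bookkeeping for twins can be sketched as follows. Here a variable occurrence is a hypothetical pair `(name, twin)`, where `twin` is `"left"`, `"right"` or `None` for an ordinary variable; `free_vars` ignores the annotation, while `lookup` uses it to select a component of the pair that a substitution supplies for a twin binding.

```python
def free_vars(occurrences):
    """Free variable names of a list of occurrences, ignoring twin marks."""
    return {name for name, _twin in occurrences}

def lookup(delta, name, twin):
    """Image of an occurrence under delta.

    delta maps an ordinary variable to a term, and a twin-bound variable
    to a pair (s, t) of terms standing for its left and right twins.
    """
    image = delta[name]
    if twin == "left":
        return image[0]
    if twin == "right":
        return image[1]
    return image

occurrences = [("x", "left"), ("x", "right"), ("y", None)]
delta = {"x": ("s", "t"), "y": "u"}
images = [lookup(delta, name, twin) for name, twin in occurrences]
```

Both twins of x contribute the single name x to the free variables, but they are replaced by different components of the pair.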
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0 Th6:A 



(5 is a substitution from A to T) 



e I r h ctx 
eir h •:• 



0|T h 5:A 
e\T^5T3 t 



0 |T h 5:A 

0|Th (s:5S) « (t:«5T) 



0|T h (6,t/x):A,x: T 0|T h (<5, (s, t)/x):A,x:SXT 



0' h mctx 



0:0 C0' 0' 



(9 is a metasubstitution from 0 to Q') 

9 : 0 C 6' 6' | • ^9T3 t = 9s 
(0,*/a):9,a : = s: T C 0' 

h9T3t 0:9 □ 6' & \ ■ \- 6 P 



(9,t/a):Q,a: T C ©' 



Figure 4.8: Typing rules for substitutions and metasubstitutions 

4.1.5 Substitutions and metasubstitutions 

Figure 4.8 defines well-typed substitutions $\delta$ and metasubstitutions $\theta$. They are
applied to terms as defined in Figure 4.2, and extended homomorphically to
syntax containing terms in the usual way.

The judgment $\Theta \mid \Gamma \vdash \delta : \Delta$ means that $\delta$ substitutes a well-typed term in $\Gamma$
for every variable in $\Delta$. Note that two provably equal terms may be substituted
for twins, since twins are not required to be definitionally equal.

The judgment $\theta : \Theta \sqsubseteq \Theta'$ means that $\theta$ substitutes a well-typed term in $\Theta'$
for every metavariable in $\Theta$. Moreover, any problem hypothesised in the original
metacontext must be true somehow in the new metacontext. This allows
metasubstitutions to be lifted to apply on derivations, as shown by Lemma 4.2 below.
Thus they give rise to an appropriate notion of stability, as in Subsection 2.1.2
(page 14). Two metasubstitutions are equivalent if they assign definitionally equal
terms to each metavariable, as defined in Figure 4.9.

The identity substitution $\iota$ on $\Delta$ includes $x/x$ for each $x : T$ and $(\acute{x}, \grave{x})/\hat{x}$ for
each $\hat{x} : S \ddagger T$ in $\Delta$. Weakening is silent, so $\Theta \mid \Gamma \vdash \iota : \Delta$ holds whenever $\Gamma$ binds all
the variables bound in $\Delta$.

I will also use $\iota : \Theta \sqsubseteq \Theta'$ for metacontexts, to represent an identity or inclusion
metasubstitution. If $\Theta'$ contains definitions for some of the metavariables in $\Theta$
then these definitions will be expanded by $\iota$, to maintain the invariant that
well-typed terms are always $\beta\delta$-normal.






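$\theta \equiv \theta' : \Theta \sqsubseteq \Theta'$ ($\theta$ and $\theta'$ are equivalent metasubstitutions from $\Theta$ to $\Theta'$)

$$\frac{\Theta' \vdash \mathsf{mctx}}{\cdot \equiv \cdot : \cdot \sqsubseteq \Theta'}$$

$$\frac{\theta \equiv \theta' : \Theta \sqsubseteq \Theta' \quad \Theta' \mid \cdot \vdash \theta T \ni t \equiv t' \quad \Theta' \mid \cdot \vdash \theta T \ni t' \equiv \theta s}
     {(\theta, t/\alpha) \equiv (\theta', t'/\alpha) : \Theta, \alpha := s : T \sqsubseteq \Theta'}$$

$$\frac{\theta \equiv \theta' : \Theta \sqsubseteq \Theta' \quad \Theta' \mid \cdot \vdash \theta T \ni t \equiv t'}
     {(\theta, t/\alpha) \equiv (\theta', t'/\alpha) : \Theta, \alpha : T \sqsubseteq \Theta'}
\qquad
\frac{\theta \equiv \theta' : \Theta \sqsubseteq \Theta' \quad \Theta' \mid \cdot \vdash \theta P}
     {\theta \equiv \theta' : \Theta, ?P \sqsubseteq \Theta'}$$

Figure 4.9: Equivalence of metasubstitutions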



4.1.6 Properties 

All the usual metatheoretic properties hold. Where proofs have been omitted, 
they are by structural induction on derivations. 

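Lemma 4.1 (Substitution). Suppose $\Theta \mid \Gamma \vdash \delta : \Delta$. Then

(a) If $\Theta \mid \Gamma, \Delta, \Gamma' \vdash \mathsf{ctx}$ then $\Theta \mid \Gamma, \delta\Gamma' \vdash \mathsf{ctx}$.

(b) If $\Theta \mid \Gamma, \Delta, \Gamma' \vdash P\ \mathsf{wf}$ then $\Theta \mid \Gamma, \delta\Gamma' \vdash \delta P\ \mathsf{wf}$.

(c) If $\Theta \mid \Gamma, \Delta, \Gamma' \vdash T \ni s \equiv[u]\equiv t$ then $\Theta \mid \Gamma, \delta\Gamma' \vdash \delta T \ni \delta s \equiv[u']\equiv \delta t$ for some $u'$.

(d) If $\Theta \mid \Gamma, \Delta, \Gamma' \vdash h_0 \cdot e_0 \equiv[h_2 \cdot e_2]\equiv h_1 \cdot e_1 \in T$ then
    $\Theta \mid \Gamma, \delta\Gamma' \vdash \delta T \ni \delta(h_0 \cdot e_0) \equiv[u]\equiv \delta(h_1 \cdot e_1)$ for some $u$.

(e) If $\Theta \mid \Gamma, \Delta, \Gamma' \vdash P$ then $\Theta \mid \Gamma, \delta\Gamma' \vdash \delta P$.

(f) If $\Theta \mid \Gamma, \Delta, \Gamma' \vdash \delta' : \Gamma''$ then $\Theta \mid \Gamma, \delta\Gamma' \vdash \delta \cdot \delta' : \delta\Gamma''$.

Lemma 4.2 (Metasubstitution). Suppose $\theta : \Theta \sqsubseteq \Theta'$.

(a) If $\Theta \mid \Gamma \vdash \mathsf{ctx}$ then $\Theta' \mid \theta\Gamma \vdash \mathsf{ctx}$.

(b) If $\Theta \mid \Gamma \vdash P\ \mathsf{wf}$ then $\Theta' \mid \theta\Gamma \vdash \theta P\ \mathsf{wf}$.

(c) If $\Theta \mid \Gamma \vdash T \ni s \equiv[u]\equiv t$ then $\Theta' \mid \theta\Gamma \vdash \theta T \ni \theta s \equiv[v]\equiv \theta t$ for some $v$.

(d) If $\Theta \mid \Gamma \vdash h \cdot e \equiv[h'' \cdot e'']\equiv h' \cdot e' \in T$ then $\Theta' \mid \theta\Gamma \vdash \theta T \ni \theta(h \cdot e) \equiv \theta(h' \cdot e')$.

(e) If $\Theta \mid \Gamma \vdash P$ then $\Theta' \mid \theta\Gamma \vdash \theta P$.

(f) If $\Theta \mid \Gamma \vdash \delta : \Delta$ then $\Theta' \mid \theta\Gamma \vdash \theta\delta : \theta\Delta$.

Lemma 4.3 (Sanity conditions).

(a) If $\Theta \mid \Gamma \vdash \mathsf{ctx}$ then $\Theta \vdash \mathsf{mctx}$.

(b) If $\Theta \mid \Gamma \vdash P\ \mathsf{wf}$ then $\Theta \mid \Gamma \vdash \mathsf{ctx}$.

(c) If $\Theta \mid \Gamma \vdash T \ni s \equiv[u]\equiv t$ then $\Theta \mid \Gamma \vdash \mathsf{ctx}$.

(d) If $\Theta \mid \Gamma \vdash h \cdot e \equiv[h'' \cdot e'']\equiv h' \cdot e' \in T$ then $\Theta \mid \Gamma \vdash \mathsf{Type} \ni T$.

(e) If $\Theta \mid \Gamma \vdash P$ then $\Theta \mid \Gamma \vdash P\ \mathsf{wf}$.

Lemma 4.4 (Definitional equality is an equivalence relation).

(a) If $\Theta \mid \Gamma \vdash T \ni t$ then $\Theta \mid \Gamma \vdash T \ni t \equiv t$.

(b) If $\Theta \mid \Gamma \vdash T \ni s \equiv[v]\equiv t$ then $\Theta \mid \Gamma \vdash T \ni t \equiv[v]\equiv s$.

(c) If $\Theta \mid \Gamma \vdash T \ni t \equiv[u]\equiv t'$ and $\Theta \mid \Gamma \vdash T \ni t' \equiv[v]\equiv t''$ then $u = v$ and
    $\Theta \mid \Gamma \vdash T \ni t \equiv[v]\equiv t''$.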

Proof. Reflexivity, part (a), is precisely the definition of the typing judgment. 

Symmetry, part (b), is by structural induction on the derivation. Since the 
standard form is preserved, it is easy to establish symmetry, because the rules use 
the standard form rather than choosing one side arbitrarily (and asymmetrically). 

Transitivity, part (c), is by structural induction on the first derivation and 
inversion on the second. The rules are syntax-directed, so in each case, the last 
rule of the second derivation must be the same as the last rule of the first. □ 

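Lemma 4.5 (Context conversion). Suppose $\Theta \mid \Gamma \vdash \mathsf{Type} \ni S \equiv T$. Then

(a) $\Theta \mid \Gamma, x : S, \Gamma' \vdash \mathsf{ctx}$ implies $\Theta \mid \Gamma, x : T, \Gamma' \vdash \mathsf{ctx}$;

(b) $\Theta \mid \Gamma, x : S, \Gamma' \vdash P\ \mathsf{wf}$ implies $\Theta \mid \Gamma, x : T, \Gamma' \vdash P\ \mathsf{wf}$;

(c) $\Theta \mid \Gamma, x : S, \Gamma' \vdash U \ni s \equiv[u]\equiv t$ implies $\Theta \mid \Gamma, x : T, \Gamma' \vdash U \ni s \equiv[u]\equiv t$;

(d) $\Theta \mid \Gamma, x : S, \Gamma' \vdash h \cdot e \equiv[h'' \cdot e'']\equiv h' \cdot e' \in U$ implies there is some $V$ such that
    $\Theta \mid \Gamma, x : T, \Gamma' \vdash h \cdot e \equiv[h'' \cdot e'']\equiv h' \cdot e' \in V$ and
    $\Theta \mid \Gamma, x : T, \Gamma' \vdash \mathsf{Type} \ni U \equiv V$;

(e) $\Theta \mid \Gamma, x : S, \Gamma' \vdash P$ implies $\Theta \mid \Gamma, x : T, \Gamma' \vdash P$;

(f) $\Theta \mid \Gamma, x : S, \Gamma' \vdash \delta : \Delta$ implies $\Theta \mid \Gamma, x : T, \Gamma' \vdash \delta : \Delta$.

A similar result applies to twins.

Lemma 4.6 (Conversion). If $\Theta \mid \Gamma \vdash \mathsf{Type} \ni S \equiv T$ and $\Theta \mid \Gamma \vdash S \ni s \equiv[u]\equiv t$
then $\Theta \mid \Gamma \vdash T \ni s \equiv[u]\equiv t$.

4.2 Specification of unification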

Having shown how to represent unification problems in context, let me address the 
question of how to solve them. Following the approach of the previous chapters, 
the idea is always to make small, local changes to the metacontext, each of which 
is type-correct, makes the problem simpler and makes no unforced intensional 
choices. This ensures that any solution found is most general. 

In the following subsections, I will examine the range of problems one might 
encounter, and discuss the step to take in each case. Then I will summarise all 
the steps of the algorithm. The steps are divided into five main groups: 

• solving equations of the form $\alpha\,\overline{x_i}^i \approx t$ by $\alpha := \lambda\overline{x_i}^i.\,t$ (Subsection 4.2.1);

• solving equations $\alpha\,\overline{x_i}^i \approx \alpha\,\overline{y_i}^i$ by limiting the domain of $\alpha$ (Subsection 4.2.2);

• gaining information via pruning (Subsection 4.2.3);

• simplifying metavariables by removing $\Sigma$-types (Subsection 4.2.4); and

• simplifying problems locally (Subsection 4.2.5). 

The rules are not deterministic, as they permit working on problems in any 
order, but the nondeterminism does not matter: every step is most general, so the 
order will not affect the final result. A deterministic algorithm can be obtained 
from the rules by choosing a suitable order (such as leftmost problem first). 

Since definitions must be immediately substituted out, in order to keep
everything $\delta$-normal, I write $\Theta, \alpha :=^{*} t : T, \Xi$ to represent $\Theta, \alpha := t : T, [t/\alpha]\Xi$.

4.2.1 Solving problems by inversion 

Given the metacontext

$\Theta, \alpha : T \to T,\ ?\forall x : T.\ \alpha\,x \approx x,$

where the equation looks like a definition, it should be unsurprising that

$\Theta, \alpha := \lambda x.x : T \to T,\ ?\forall x : T.\ x \approx x$

is a most general solution. Miller (1992) observed that, in general, the problem
$\forall\Gamma.\ \alpha\,\overline{x_i}^i \approx t$ has unique solution $\alpha := \lambda\overline{x_i}^i.\,t$ provided that the evaluation context
of $\alpha$ is a list of distinct variables containing all the free variables of $t$, and $\alpha$ does
not occur in $t$.

On the other hand, an equation like $\alpha\,\mathsf{tt} \approx t$ is not a good definition for $\alpha$:
taking $\alpha := \lambda x.t$ is a solution but is not most general, because another equation
might require $\alpha\,\mathsf{ff} \approx s$ for some $s \neq t$. Defining $\alpha$ by case analysis is not most
general as it makes an unforced intensional choice: a later equation might demand
$\alpha \approx \lambda x.\mathsf{ff}$. Intuitively, Miller's pattern condition says that only an application
to variables 'captures the whole nature' of the metavariable; an application to
non-variables only determines it for those specific arguments.

Linearity 

It is crucial that variables occurring in $t$ appear linearly (exactly once) in $\overline{x_i}^i$.
The equation $\beta\,x\,x \approx x$ cannot be solved immediately, as $\beta$ could project either
its first or second argument, so there is no unique most general solution. On the
other hand, $\gamma\,y\,x\,y \approx x$ can be solved unambiguously by $\gamma := \lambda y_0\,x\,y_1.\,x$ despite
the repetition of $y$. The $\overline{x_i}^i$ may include twins, which are treated as equal for the
purposes of this check.
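The linearity check can be sketched in a few lines of Python. This is only an illustration under assumptions of my own: variables are modelled as hypothetical `(name, twin)` pairs, where `twin` is `None`, `"L"` or `"R"`, and the free variables of the right-hand side are assumed to have been computed already. The names `base_name` and `is_linear_on` are not part of the algorithm's presentation.

```python
from collections import Counter

def base_name(var):
    """Drop the twin annotation from a variable given as a (name, twin) pair."""
    name, _twin = var
    return name

def is_linear_on(spine, free_vars):
    """Check that every variable in free_vars occurs exactly once in spine.

    Twins are treated as the same variable, so only base names are compared.
    A count of zero also fails: the variable would be out of scope.
    """
    occurrences = Counter(base_name(v) for v in spine)
    return all(occurrences[base_name(v)] == 1 for v in free_vars)

x, y = ("x", None), ("y", None)

# beta x x ≈ x: x is repeated in the spine and occurs on the right.
beta_ok = is_linear_on([x, x], {x})

# gamma y x y ≈ x: y is repeated, but only x occurs on the right.
gamma_ok = is_linear_on([y, x, y], {x})
```

Here `beta_ok` is false and `gamma_ok` is true, matching the two examples in the text.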

Occurs check 

If $\alpha$ occurs in $t$, then it is obviously unsound to use $t$ as the definition for $\alpha$.
However, the question of whether the problem can have a solution at all is more
subtle, and depends on the exact form of the occurrence.

A subterm occurs flexibly if it is in the evaluation context of a metavariable,
and rigidly if not. In the term $\alpha\,x \to y\,z$, $\alpha$, $y$ and $z$ occur rigidly while $x$
occurs flexibly. Miller (1992, p. 26) describes rigid occurrences as 'permanent' and
flexible occurrences as 'possible', because flexible occurrences might be removed
by substituting for metavariables but rigid occurrences cannot. A rigid occurrence
is strong if it is not in the evaluation context of a variable, so no substitution for
variables can remove it. In the example, $y$ occurs strong rigidly but $z$ does not.

I write $\mathrm{fmv}(t)$ for the set of free metavariables and $\mathrm{fv}(t)$ for the set of free
variables of $t$. Either may have a $\mathrm{rig}$ or $\mathrm{srig}$ superscript to include only those that
occur rigidly or strong rigidly (respectively).

Reed (2009b, §5.1.5) observes that when performing the occurs check before
solving a metavariable, a problem is definitely unsolvable if

• the metavariable occurs strong rigidly in its own candidate solution, such
as in $\alpha\,x \approx \alpha\,\mathsf{tt} \to \alpha\,\mathsf{ff}$; or

• an application of the metavariable to variables occurs rigidly in its own
candidate solution, such as $x\,(\alpha\,x)$.

If a weak rigid occurrence of a metavariable is applied to non-variables, the
problem may have solutions; for example, $\beta\,y \approx y\,(\beta\,(\lambda x.x))$ is solvable (by taking
$\beta := \lambda y.\,y\,\mathsf{tt}$, amongst other things). Again, Miller's pattern condition appears:
only an application to variables determines the whole nature of a metavariable.

Permuting the metacontext 

If $t$ depends on some metavariables declared after $\alpha$, these must be moved prior to
$\alpha$ for the definition to be well-scoped. However, other metavariables may depend
on $\alpha$, so they must remain after it. For example, given the context

$\Theta, \alpha : \mathsf{Set}, \beta : \mathsf{Set}, \gamma : \alpha,\ ?\alpha \approx \beta \to \beta$

an appropriate solution is

$\Theta, \beta : \mathsf{Set}, \alpha := \beta \to \beta : \mathsf{Set}, \gamma : \beta \to \beta.$

In general, solving

$\Theta, \alpha : T, \Xi,\ ?\forall\Gamma.\ \alpha\,\overline{x_i}^i \approx t$

requires finding a dependency-respecting permutation of $\Xi$ into two segments $\Xi_0$
and $\Xi_1$ (written $\Xi = \Xi_0, \Xi_1$), where $\Xi_0$ contains all the metavariables that occur in
$t$ and its type, and does not depend on $\alpha$. If the necessary permutation does not
exist, then $\alpha$ cannot be solved immediately, though solving other metavariables
may remove the dependency cycle. The existence of such a permutation can be
determined in a small-step fashion by scanning dependencies from right to left,
as in the instantiation judgment for first-order unification (Figure 2.5, page 21).
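The right-to-left scan can be illustrated with a short Python sketch. It is a simplification under assumptions of my own: each entry of the suffix is a hypothetical `(name, deps)` pair, where `deps` is the set of metavariables mentioned in the entry's type or definition, problems in the suffix are not modelled, and the occurs check for `alpha` itself is assumed to have been done separately. The function name `split_suffix` is illustrative.

```python
def split_suffix(alpha, suffix, needed):
    """Split a dependency-ordered suffix into (xi0, xi1), or return None.

    alpha:  name of the metavariable being solved
    suffix: list of (name, deps) entries declared after alpha, in order
    needed: metavariables occurring in the candidate solution and its type

    xi0 collects the entries that must move before alpha; xi1 keeps the rest.
    The result is None if a needed entry depends on alpha (a dependency cycle).
    """
    needed = set(needed)
    xi0, xi1 = [], []
    for name, deps in reversed(suffix):       # scan from right to left
        if name in needed:
            if alpha in deps:
                return None                   # cannot move this entry before alpha
            needed |= deps                    # its dependencies must move as well
            xi0.append((name, deps))
        else:
            xi1.append((name, deps))
    xi0.reverse()                             # restore the original relative order
    xi1.reverse()
    return xi0, xi1

# The example above: alpha : Set, beta : Set, gamma : alpha, with alpha ≈ beta -> beta.
suffix = [("beta", frozenset()), ("gamma", frozenset({"alpha"}))]
result = split_suffix("alpha", suffix, {"beta"})
```

In this run `beta` is moved before `alpha` and `gamma` stays after it, which corresponds to the solved context shown above.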

Typechecking 

Once the algorithm has a candidate solution $\lambda\overline{x_i}^i.\,t$ for $\alpha$, it must check that
the solution is well typed, as heterogeneity means that this is not guaranteed. In
particular, the type of $t$ might not be definitionally equal to the type of $\alpha\,\overline{x_i}^i$, or if
some twin variable $\acute{y}$ occurs in $\overline{x_i}^i$ and $\grave{y}$ occurs in $t$, then the solution will not be
valid until the types of $\acute{y}$ and $\grave{y}$ become definitionally equal. Strictly speaking it
is not necessary to fully recheck the solution: it is enough to test these conditions
directly and rely on the fact that the original problem was well-typed. A real
implementation would record the desired solution for $\alpha$ and the constraints that
must be solved before it can be applied, as in Agda (Norell, 2007, Ch. 3).

$\mathrm{intersect}\ \cdot\ \cdot\ \cdot \mapsto \cdot$

$\mathrm{intersect}\ (\Delta, z : S)\ (\overline{x_i}^i, x)\ (\overline{y_i}^i, y) \mapsto
\begin{cases}
\mathrm{intersect}\ \Delta\ \overline{x_i}^i\ \overline{y_i}^i,\ z : S & \text{if } x \sim y \\
\mathrm{intersect}\ \Delta\ \overline{x_i}^i\ \overline{y_i}^i & \text{otherwise}
\end{cases}$

Figure 4.10: Intersection

4.2.2 Solving flex-flex problems by intersection 

As well as equations between an eliminated metavariable and an arbitrary term,
some equations have the form $\alpha \cdot e \approx \alpha \cdot e'$, with the same metavariable on
both sides but different evaluation contexts. If both contexts are applications of
lists of variables, then a most general solution is given by restricting $\alpha$ to those
arguments on which the two lists of variables agree. For example, a solution of

$\Theta, \alpha : T \to T \to T,\ ?\forall x : T.\ \forall y : T.\ \alpha\,x\,x \approx \alpha\,y\,x$

is possible only if $\alpha$ does not depend on its first argument, giving

$\Theta, \beta : T \to T, \alpha := \lambda\_.\beta : T \to T \to T,\ ?\forall x : T.\ (\beta\,x : T) \approx (\beta\,x : T)$

where $\beta$ is a fresh metavariable.

Figure 4.10 defines the operation $\mathrm{intersect}\ \Delta\ \overline{x_i}^i\ \overline{y_i}^i$, which takes a telescope $\Delta$
and two lists of variables to fit it, and produces the telescope on which they agree.
Twin variables are considered equal for the purposes of intersection, though in
any case, twins could be replaced with a single variable since they must share a
common type. Given the context

$\Theta, \alpha : \Pi\Delta.\ T, \Xi,\ ?\forall\Gamma.\ \alpha\,\overline{x_i}^i \approx \alpha\,\overline{y_i}^i$

the problem is solved by creating a fresh metavariable $\beta$ and defining

$\Theta, \beta : \Pi\Delta'.\ T, \alpha :=^{*} \lambda\Delta.\ \beta\,\Delta' : \Pi\Delta.\ T, \Xi$ where $\Delta' = \mathrm{intersect}\ \Delta\ \overline{x_i}^i\ \overline{y_i}^i$

provided the free variables of the codomain $T$ are retained in the telescope $\Delta'$.
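As an illustration, the intersection operation of Figure 4.10 can be written as a short Python function. The representation is an assumption of mine rather than part of the formal development: a telescope is a list of hypothetical `(name, type)` bindings, and variables are `(name, twin)` pairs that are compared up to twin annotation. The side condition on the free variables of the codomain is not checked here.

```python
def same_or_twins(x, y):
    """Variables are (name, twin) pairs; twins share a name."""
    return x[0] == y[0]

def intersect(telescope, xs, ys):
    """Keep the bindings of the telescope at positions where xs and ys agree.

    telescope: list of (name, type) bindings
    xs, ys:    two argument lists of variables, one per binding
    """
    if not (len(telescope) == len(xs) == len(ys)):
        raise ValueError("argument lists must fit the telescope")
    return [binding
            for binding, x, y in zip(telescope, xs, ys)
            if same_or_twins(x, y)]

x, y = ("x", None), ("y", None)

# alpha x x ≈ alpha y x: the lists disagree on the first argument only.
restricted = intersect([("a", "T"), ("b", "T")], [x, x], [y, x])
```

For the example in the text, `restricted` contains only the second binding, so the fresh metavariable takes a single argument.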

In LF, one can define intersection for arbitrary argument lists that contain
no metavariables, but this is not possible in a type theory with large elimination.
For example, $\alpha\,\mathsf{tt}\,x \approx \alpha\,\mathsf{tt}\,y$ does not imply that $\alpha$ is independent of its second
argument, as it might be defined by case analysis on its first argument.

4.2.3 Pruning

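The problem in

$\Theta, \alpha : (T \to T) \to T,\ ?\forall x : (T \to T).\ \forall y : T.\ \alpha\,x \approx x\,y$

is unsolvable, because there is no way for $\alpha$ to depend on $y$, since it does not
occur as an argument on the left-hand side. On the other hand,

$\Theta, \beta : T \to T, \alpha : (T \to T) \to T,\ ?\forall x : (T \to T).\ \forall y : T.\ \alpha\,x \approx x\,(\beta\,y)$

can be solved by observing that $\beta$ may not depend on its argument, so it must
be of the form $\lambda\_.\gamma$ for some fresh metavariable $\gamma$. This gives

$\Theta, \gamma : T, \beta := \lambda\_.\gamma : T \to T, \alpha : (T \to T) \to T,\ ?\forall x : (T \to T).\ \forall y : T.\ \alpha\,x \approx x\,\gamma$

which can be solved by

$\Theta, \gamma : T, \beta := \lambda\_.\gamma : T \to T, \alpha := \lambda x.\,x\,\gamma : (T \to T) \to T.$

For a problem of the form $\forall\Gamma.\ \alpha \cdot e \approx t$ to be solvable, all the free variables of $t$
must occur in $e$; otherwise, they will be out of scope for solutions of $\alpha$. If any
out-of-scope variables occur rigidly in $t$, then the equation can never be solved. If an
out-of-scope variable occurs flexibly, in the evaluation context of a metavariable,
then it might be possible to remove the occurrence by pruning the metavariable,
restricting its telescope of arguments.

Pruning cannot always remove occurrences of out-of-scope variables. For
example, pruning the equation $\forall x : T.\ \alpha \approx \beta\,(\gamma\,x)$ fails because it is not clear which
metavariable ignores its argument: either $\beta$ or $\gamma$ could be constant, so there is
no most general solution. In this situation, the unification algorithm will have to
tackle other constraints, which may result in the problem becoming easier.

Moreover, knowing $\beta\,\mathsf{tt}\,x$ cannot depend on $x$ does not mean that $\beta$ cannot
depend on its second argument, because it might be defined by case analysis on
the first argument (so removing other arguments might lose solutions). Pruning
therefore retains arguments only if they are variables, failing otherwise. Once
again, Miller's pattern condition appears: a constraint captures the entire
behaviour of a metavariable only if the metavariable is applied to a list of variables.

$\mathrm{pruneTm}\ V\ t \mapsto (\beta, \Delta)$ (pruning $t$ to $V$ requires $\beta$ to have telescope $\Delta$)

$$\frac{\Theta \ni \beta : \Pi\Delta.\ T \quad \mathrm{prune}\ V\ \Delta\ \overline{t_i}^i \mapsto \Delta' \quad \mathrm{fv}(T) \subseteq \mathrm{vars}(\Delta') \quad \Delta \neq \Delta'}
     {\mathrm{pruneTm}\ V\ (\beta\,\overline{t_i}^i) \mapsto (\beta, \Delta')}$$

$$\frac{\mathrm{pruneTm}\ V\ S \mapsto (\beta, \Delta)}{\mathrm{pruneTm}\ V\ (\Pi x : S.\ T) \mapsto (\beta, \Delta)}
\qquad
\frac{\mathrm{pruneTm}\ (V \cup \{x\})\ T \mapsto (\beta, \Delta)}{\mathrm{pruneTm}\ V\ (\Pi x : S.\ T) \mapsto (\beta, \Delta)}$$

$$\frac{\mathrm{pruneTm}\ V\ S \mapsto (\beta, \Delta)}{\mathrm{pruneTm}\ V\ (\Sigma x : S.\ T) \mapsto (\beta, \Delta)}
\qquad
\frac{\mathrm{pruneTm}\ (V \cup \{x\})\ T \mapsto (\beta, \Delta)}{\mathrm{pruneTm}\ V\ (\Sigma x : S.\ T) \mapsto (\beta, \Delta)}$$

$$\frac{\mathrm{pruneTm}\ V\ s \mapsto (\beta, \Delta)}{\mathrm{pruneTm}\ V\ (s, t) \mapsto (\beta, \Delta)}
\qquad
\frac{\mathrm{pruneTm}\ V\ t \mapsto (\beta, \Delta)}{\mathrm{pruneTm}\ V\ (s, t) \mapsto (\beta, \Delta)}
\qquad
\frac{\mathrm{pruneTm}\ (V \cup \{x\})\ t \mapsto (\beta, \Delta)}{\mathrm{pruneTm}\ V\ (\lambda x.\,t) \mapsto (\beta, \Delta)}$$

$$\frac{\mathrm{pruneTm}\ V\ s \mapsto (\beta, \Delta)}{\mathrm{pruneTm}\ V\ (x \cdot (e\ s \cdot e')) \mapsto (\beta, \Delta)}
\qquad
\frac{\mathrm{pruneTm}\ (V \cup \{y\})\ T \mapsto (\beta, \Delta)}{\mathrm{pruneTm}\ V\ (x \cdot (\mathsf{if}_{(y.T)}\ e\ s\ t \cdot e')) \mapsto (\beta, \Delta)}$$

$$\frac{\mathrm{pruneTm}\ V\ s \mapsto (\beta, \Delta)}{\mathrm{pruneTm}\ V\ (x \cdot (\mathsf{if}_{(y.T)}\ e\ s\ t \cdot e')) \mapsto (\beta, \Delta)}
\qquad
\frac{\mathrm{pruneTm}\ V\ t \mapsto (\beta, \Delta)}{\mathrm{pruneTm}\ V\ (x \cdot (\mathsf{if}_{(y.T)}\ e\ s\ t \cdot e')) \mapsto (\beta, \Delta)}$$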

$\mathrm{prune}\ V\ \Delta\ \overline{t_i}^i \mapsto \Delta'$ (pruning arguments $\overline{t_i}^i$ in $\Delta$ to $V$ gives telescope $\Delta'$)

$$\frac{}{\mathrm{prune}\ V\ \cdot\ \cdot \mapsto \cdot}
\qquad
\frac{\mathrm{prune}\ V\ \Delta\ \overline{t_i}^i \mapsto \Delta' \quad y \in V \quad \mathrm{fv}(S) \subseteq \mathrm{vars}(\Delta')}
     {\mathrm{prune}\ V\ (\Delta, x : S)\ (\overline{t_i}^i, y) \mapsto \Delta', x : S}$$

$$\frac{\mathrm{prune}\ V\ \Delta\ \overline{t_i}^i \mapsto \Delta' \quad \mathrm{fv}^{\mathrm{rig}}(s) \not\subseteq V}
     {\mathrm{prune}\ V\ (\Delta, x : S)\ (\overline{t_i}^i, s) \mapsto \Delta'}$$

Figure 4.11: Pruning

Pruning uses two auxiliary relations defined in Figure 4.11. Both depend on
a set $V$ of variables that may occur in arguments, which will initially be $\mathrm{fv}(e)$ and
will accumulate locally bound variables.

• The relation $\mathrm{pruneTm}\ V\ t \mapsto (\beta, \Delta')$ means that $t$ has an occurrence of $\beta$,
whose telescope has been pruned to $\Delta'$. This works by searching $t$ for a
subterm $\beta\,\overline{t_i}^i$ then using the following function.

• The relation $\mathrm{prune}\ V\ \Delta\ \overline{t_i}^i \mapsto \Delta'$ computes the pruned telescope $\Delta'$ for $\beta$,
where $\Delta$ is its original telescope and $\overline{t_i}^i$ is the list of its arguments.

These relations are partial, as pruning may fail, and the former is
nondeterministic, as there may be multiple ways to prune a term. The nondeterminism does
not matter, however, as pruning is always a most general step and can be applied
repeatedly if necessary.

To prune a telescope $\Delta, x : S$ corresponding to the list of arguments $\overline{t_i}^i, s$, the
preceding telescope $\Delta$ is pruned with the list of arguments $\overline{t_i}^i$. If this succeeds,
producing $\Delta'$, then there are three possible cases:

• if $s$ is a variable $y \in V$, whose type depends only on variables that remain in
the pruned telescope $\Delta'$, then the binding $x : S$ can be left in the telescope;

• if $s$ has a rigid occurrence of a variable not in $V$, then the binding must be
removed from the telescope;

• otherwise, pruning fails.

If $s$ has a flexible occurrence of a variable not in $V$, pruning fails because while
the whole term cannot depend on the variable, it is not clear which metavariable
projects it away, as in the $\alpha \approx \beta\,(\gamma\,x)$ example.
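The three cases can be sketched in Python as follows. This is a simplified model under assumptions of my own: each binding is a hypothetical `(name, type_fvs)` pair recording which telescope variables its type mentions, and each argument is summarised by an `Arg` record holding the variable name (if the argument is a bare variable) and its set of rigidly occurring variables. The names `Arg`, `variable`, `prune` and `PruningFailed` are illustrative only.

```python
from dataclasses import dataclass

class PruningFailed(Exception):
    """Raised when there is no most general way to prune the telescope."""

@dataclass(frozen=True)
class Arg:
    var: object = None                  # variable name if a bare variable, else None
    rigid_fvs: frozenset = frozenset()  # variables occurring rigidly in the argument

def variable(name):
    """Summarise an argument that is just the variable `name`."""
    return Arg(var=name, rigid_fvs=frozenset({name}))

def prune(allowed, telescope, args):
    """Compute the pruned telescope, processing bindings from left to right.

    allowed:   the set V of variables that may occur in arguments
    telescope: list of (name, type_fvs) bindings
    args:      one Arg per binding
    """
    kept = []
    for (name, type_fvs), arg in zip(telescope, args):
        kept_names = {n for n, _ in kept}
        if arg.var is not None and arg.var in allowed and set(type_fvs) <= kept_names:
            kept.append((name, type_fvs))   # case 1: keep the binding
        elif not arg.rigid_fvs <= allowed:
            continue                        # case 2: rigid out-of-scope variable, drop it
        else:
            raise PruningFailed(name)       # case 3: no most general choice
    return kept

# prune {x} (z : T) y: the argument y is not an allowed variable.
pruned = prune({"x"}, [("z", frozenset())], [variable("y")])
```

Here `pruned` is the empty telescope, since the only argument is a variable outside the allowed set. An argument such as `tt`, which is neither a variable nor contains a rigid out-of-scope variable, makes `prune` raise `PruningFailed`.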

Note that the potential presence of type dependencies means pruning must
check the well-formedness of types. For example, if $\beta : \Pi x : S.\ T$ where $x$ occurs
free in $T$, then the first argument of $\beta$ cannot be pruned.

For the earlier example

$\Theta, \beta : T \to T, \alpha : (T \to T) \to T,\ ?\forall x : (T \to T).\ \forall y : T.\ \alpha\,x \approx x\,(\beta\,y)$

we have $\mathrm{pruneTm}\ \{x\}\ (x\,(\beta\,y)) \mapsto (\beta, \cdot)$, because $y$ does not occur in the set of
allowed variables $\{x\}$, so $\mathrm{prune}\ \{x\}\ (z : T)\ y \mapsto (\cdot)$, i.e. the telescope $z : T$ of $\beta$ is
pruned to the empty telescope.

In the metacontext

$\Theta, \beta : \Pi\Delta.\ T, \Xi,\ ?\forall\Gamma.\ \alpha \cdot e \approx t,$

if $\mathrm{pruneTm}\ (\mathrm{fv}(e))\ t \mapsto (\beta, \Delta')$ and all the variables in $T$ are retained in $\Delta'$ then
pruning $\beta$ results in the metacontext

$\Theta, \gamma : \Pi\Delta'.\ T, \beta :=^{*} \lambda\Delta.\ \gamma\,\Delta' : \Pi\Delta.\ T, \Xi,\ ?\forall\Gamma.\ \alpha \cdot e \approx t$

where $\gamma$ is a fresh metavariable. This restricts the telescope in a similar way to
intersection, though it does not apply to $\alpha$ but a different metavariable.

4.2.4 Metavariable simplification

Suppose $\alpha : \Sigma x : S.\ T$; how might the constraint $\alpha\ \mathsf{hd} \approx s$ be solved? One option is
to extend the pattern fragment to cover projections, as Duggan (1998) does for
System $F_\omega$, but I take the simpler option of aggressively lowering metavariables
to eliminate projections. In this case, replacing $\alpha$ with the pair $(\beta_0, \beta_1)$ of fresh
metavariables $\beta_0 : S$, $\beta_1 : T\{\beta_0\}$ simplifies the constraint to $\beta_0 \approx s$.

In general, the metavariable $\alpha$ might be under a telescope of parameters, so
$\alpha : \Pi\Delta.\ \Sigma x : S.\ T$ can be replaced with

$\alpha_0 : \Pi\Delta.\ S,\ \alpha_1 : \Pi\Delta.\ T\{\alpha_0\,\Delta\},\ \alpha := \lambda\Delta.\,(\alpha_0\,\Delta, \alpha_1\,\Delta).$

Similarly, a metavariable $\alpha : \Pi x : (\Sigma y : S.\ T).\ U$ can be uncurried to produce
$\beta : \Pi y : S.\ \Pi z : T.\ [(y, z)/x]\,U$, which will transform the non-pattern constraint
$\alpha\,(y, z) \approx t$ into the pattern $\beta\,y\,z \approx t$. The general case is even worse here, as $\alpha$
might have a telescope of parameters and the type of $x$ might have parameters
preceding the $\Sigma$. Thus $\alpha : \Pi\Delta.\ \Pi x : (\Pi\Delta'.\ \Sigma z : S.\ T).\ U$ can be replaced with

$\beta : \Pi\Delta.\ \Pi y : (\Pi\Delta'.\ S).\ \Pi z : (\Pi\Delta'.\ T\{y\,\Delta'\}).\ U\{\lambda\Delta'.\,(y\,\Delta', z\,\Delta')\},$
$\alpha := \lambda\Delta.\ \lambda x.\ \beta\,\Delta\,(\lambda\Delta'.\ x\,\Delta'\ \mathsf{hd})\,(\lambda\Delta'.\ x\,\Delta'\ \mathsf{tl}).$

These transformations maintain the same set of solutions thanks to the $\eta$-rule
for $\Sigma$-types, otherwise known as surjective pairing, $(n\ \mathsf{hd}, n\ \mathsf{tl}) \equiv_\eta n$. This is built
into the definitional equality by the rule for pairs, which always $\eta$-expands the
terms being compared.

4.2.5 Problem simplification

The problem decomposition operation $P \Mapsto Q$ locally replaces a problem with
a simpler problem without changing the rest of the metacontext. Each
decomposition step can be applied in an arbitrary context. Thus $P \Mapsto Q$ means that
$\Theta, ?\forall\Gamma.\ P$ can be replaced with $\Theta, ?\forall\Gamma.\ Q$. Additionally, conjunctions can be
split into their components, replacing $\Theta, ?\forall\Gamma.\ P \wedge Q$ by $\Theta, ?\forall\Gamma.\ P, ?\forall\Gamma.\ Q$, and
trivial problems can be removed, replacing $\Theta, ?\top$ with $\Theta$. First I will discuss
the decomposition steps, then later summarise them in Figure 4.14. Steps are
numbered for ease of reference.

Perhaps the most basic simplification step is the removal of equations that
are reflexive up to the definitional equality, and hence trivial:

$(s : S) \approx (t : T) \Mapsto \top$  (4.1)

if $\Theta \mid \Gamma \vdash \mathsf{Type} \ni S \equiv[U]\equiv T$ and $\Theta \mid \Gamma \vdash U \ni s \equiv t$

$\eta$-expansion

Given an equation between two functions, we saw in Subsection 4.1.4 that both
sides can be $\eta$-expanded, even if the domains are not definitionally equal, by
introducing twin variables. Thus $\alpha \approx \lambda x.t$ becomes $\alpha\,\acute{x} \approx t\{\grave{x}\}$. Similarly, pairs
can be $\eta$-expanded, for example turning $(\alpha, \beta) \approx s$ into $\alpha \approx s\ \mathsf{hd}$ and $\beta \approx s\ \mathsf{tl}$.

$(f : \Pi x : S.\ T) \approx (g : \Pi x : U.\ V) \Mapsto \forall\hat{x} : S \ddagger U.\ (f\,\acute{x} : T\{\acute{x}\}) \approx (g\,\grave{x} : V\{\grave{x}\})$  (4.2)

$(s : \Sigma x : S.\ T) \approx (t : \Sigma x : U.\ V) \Mapsto (s\ \mathsf{hd} : S) \approx (t\ \mathsf{hd} : U) \wedge (s\ \mathsf{tl} : T\{s\ \mathsf{hd}\}) \approx (t\ \mathsf{tl} : V\{t\ \mathsf{hd}\})$  (4.3)

Rigid-rigid decomposition

A rigid-rigid equation is one where neither side is a metavariable in an evaluation
context, so either the same head symbol appears on both sides, or the equation
is unsolvable. For example, $\Pi x : S.\ T \approx \Pi x : U.\ V$ can be decomposed into
$S \approx U \wedge T \approx V$, though twins must be used because $S$ and $U$ might not be
definitionally equal. A similar decomposition applies to $\Sigma$-types.

$\Pi x : S.\ T \approx \Pi x : U.\ V \Mapsto S \approx U \wedge \forall\hat{x} : S \ddagger U.\ T\{\acute{x}\} \approx V\{\grave{x}\}$  (4.4)

$\Sigma x : S.\ T \approx \Sigma x : U.\ V \Mapsto S \approx U \wedge \forall\hat{x} : S \ddagger U.\ T\{\acute{x}\} \approx V\{\grave{x}\}$  (4.5)

If the equation is between two eliminated variables, $x \cdot e \approx x' \cdot e'$, it can be
decomposed into equations between the arguments contained in the evaluation
contexts, provided they match. For example, the problem

$\forall\hat{x} : (S \to U \times U) \ddagger (T \to V \times V).\ (\acute{x}\,s\ \mathsf{hd} : U) \approx (\grave{x}\,t\ \mathsf{hd} : V)$

decomposes into the equation $(s : S) \approx (t : T)$. On the other hand, $y\ \mathsf{hd} \approx y\ \mathsf{tl}$ has
no solutions, because the projections do not match.

$x \cdot \bullet \bowtie x' \cdot \bullet \mapsto \top$ if $x \sim x'$
$x \cdot (e\ s) \bowtie x' \cdot (e'\ t) \mapsto x \cdot e \bowtie x' \cdot e' \wedge s \approx t$
$x \cdot (e\ \mathsf{hd}) \bowtie x' \cdot (e'\ \mathsf{hd}) \mapsto x \cdot e \bowtie x' \cdot e'$
$x \cdot (e\ \mathsf{tl}) \bowtie x' \cdot (e'\ \mathsf{tl}) \mapsto x \cdot e \bowtie x' \cdot e'$
$x \cdot (\mathsf{if}_{(y.T)}\ e\ s\ t) \bowtie x' \cdot (\mathsf{if}_{(y.T')}\ e'\ s'\ t') \mapsto x \cdot e \bowtie x' \cdot e' \wedge (\forall y : \mathbb{B}.\ T \approx T') \wedge s \approx s' \wedge t \approx t'$

Figure 4.12: Evaluation context decomposition

$s \perp t$ ($s$ and $t$ are rigidly incompatible)

Canonical forms: $\Pi x : S.\ T \perp \Sigma y : U.\ V$; $\Pi x : S.\ T \perp c$; $\Sigma x : S.\ T \perp c$; and $c \perp c'$ if $c \neq c'$.

Eliminated variables against canonical forms: $x \cdot e \perp \Pi y : S.\ T$; $x \cdot e \perp \Sigma y : S.\ T$; $x \cdot e \perp c$.

Eliminated variables against each other: $x \cdot \bullet \perp x' \cdot \bullet$ for distinct variables, together with rules
relating evaluation contexts whose eliminators (application, $\mathsf{hd}$, $\mathsf{tl}$, $\mathsf{if}$) do not match.

Symmetry: $t \perp s$ if $s \perp t$.

Figure 4.13: Impossible constraints

The evaluation context decomposition function $x \cdot e \bowtie x' \cdot e'$, which is defined in
Figure 4.12, computes the conjunction of problems required to make $x \cdot e \approx x' \cdot e'$.
It is made available via the step

$x \cdot e \approx x' \cdot e' \Mapsto x \cdot e \bowtie x' \cdot e'$  (4.6)

The outermost eliminator in the evaluation context is decomposed first, with the
equality of the variables (ignoring twin annotations) being checked last, to allow
for extension to handle proof-irrelevant types.⁴
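A cut-down version of this decomposition can be sketched in Python. It covers only application and the two projections, and leaves out the `if` eliminator and all typing information; these omissions, the list-of-tuples representation of evaluation contexts, and the function name `decompose` are assumptions of mine for illustration. Head variables are given as plain names with twin annotations already stripped.

```python
def decompose(x, e, x2, e2):
    """Return the argument equations needed for x·e ≈ x2·e2, or None on a mismatch.

    e, e2: evaluation contexts as lists of eliminators, innermost first;
           each eliminator is ("app", argument), ("hd",) or ("tl",).
    """
    if len(e) != len(e2):
        return None
    equations = []
    # The outermost eliminator is compared first, the head variables last.
    for elim, elim2 in zip(reversed(e), reversed(e2)):
        if elim[0] != elim2[0]:
            return None                       # e.g. hd against tl
        if elim[0] == "app":
            equations.append((elim[1], elim2[1]))
    if x != x2:
        return None
    equations.reverse()                       # report equations innermost first
    return equations

# x s hd ≈ x t hd decomposes to the single equation s ≈ t.
eqs = decompose("x", [("app", "s"), ("hd",)], "x", [("app", "t"), ("hd",)])

# y hd ≈ y tl has no solutions because the projections differ.
mismatch = decompose("y", [("hd",)], "y", [("tl",)])
```

Here `eqs` holds the one equation between `s` and `t`, and `mismatch` is `None`.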

The evaluation context decomposition function is partial because a mismatched
equation like $x \approx y$ for distinct $x$ and $y$, or $y\ \mathsf{hd} \approx y\ \mathsf{tl}$, has no solutions.
Similarly, equations between dissimilar canonical constructors (such as $\mathsf{tt} \approx \mathsf{ff}$) are
unsolvable. To capture this, Figure 4.13 defines the relation $s \perp t$, meaning that
$s$ and $t$ are rigidly incompatible, so $s \approx t$ can never be solved. The step

$s \approx t \Mapsto \bot$ if $s \perp t$  (4.7)

allows $\bot$ to be derived from such a contradiction. This definition depends on the
fact that equations are being solved up to the intensional definitional equality:
$(x : S) \approx (y : S)$ can be solved up to extensionality if $S$ has only one inhabitant.⁵

$\eta$-contraction of subterms

Miller's pattern condition requires that a metavariable should be applied to a
list of variables. As the definitional equality includes $\eta$-conversion, however,
it is enough for the arguments to be $\eta$-contractible to variables. For example,
$\alpha\,(\lambda x.\,y\,x) \approx t$ can be $\eta$-contracted to $\alpha\,y \approx t$, potentially allowing the solution
$\alpha := \lambda y.t$. This motivates the steps

$P\{\lambda x.\,n\,x\} \Mapsto P\{n\}$  (4.8)

$P\{(n\ \mathsf{hd}, n\ \mathsf{tl})\} \Mapsto P\{n\}$  (4.9)

that permit $\eta$-contraction anywhere inside problems. In practice, these are useful
only to make steps that depend on the pattern condition apply, so an
implementation would perform $\eta$-contraction only when testing the pattern condition.

⁴ Eliminations of an empty type can be equal even if the eliminated terms are not equal.
⁵ Also, given proof-irrelevant types, the definition of $s \perp t$ would need to check that the
types were not proof-irrelevant (and could not become so after instantiation of metavariables).

Parameter simplification

Parameters that do not occur in the problem can be discarded by the four steps

$\forall x : T.\ P \Mapsto P$ if $x \notin \mathrm{fv}(P)$  (4.10)

$\forall\hat{x} : S \ddagger T.\ P \Mapsto P$ if $x \notin \mathrm{fv}(P)$  (4.11)

$\forall\hat{x} : S \ddagger T.\ P\{\acute{x}\} \Mapsto \forall x : S.\ P$ if $\Theta \mid \Gamma, x : S \vdash P\ \mathsf{wf}$  (4.12)

$\forall\hat{x} : S \ddagger T.\ P\{\grave{x}\} \Mapsto \forall x : T.\ P$ if $\Theta \mid \Gamma, x : T \vdash P\ \mathsf{wf}$  (4.13)

The point of these steps is to remove unnecessary dependencies, making it
easier to compute the dependency-respecting permutation required when solving a
metavariable by inversion. Again, they depend on intensionality, because
extensionally a problem that quantifies over an empty type is trivially solvable. Here
$\Theta$ and $\Gamma$ are implicitly parameters to the decomposition relation $\Mapsto$, used in steps
(4.12) and (4.13) to emphasise that $P$ depends only on one of the twins.

Given a pair of twins whose types are definitionally equal, they can be replaced
with a single variable, potentially allowing further progress. For example, the
problem $\forall\hat{x} : S \ddagger S.\ s\{\acute{x}\} \approx t\{\grave{x}\}$ becomes $\forall x : S.\ s\{x\} \approx t\{x\}$.

$\forall\hat{x} : S \ddagger T.\ P \Mapsto \forall x : U.\ P\{x, x\}$  (4.14)

if $\Theta \mid \Gamma \vdash \mathsf{Set} \ni S \equiv[U]\equiv T$

If a parameter has a $\Sigma$-type, it can be replaced with two parameters in order
to eliminate projections from equations, as in metavariable simplification
(Subsection 4.2.4). For example, the problem $\forall x : (\Sigma y : S.\ T).\ \alpha\,(x\ \mathsf{tl}) \approx t\{x\}$ can
simplify to $\forall y : S, z : T.\ \alpha\,z \approx t\{(y, z)\}$. This simplification happens by the step

$\forall x : (\Pi\Delta.\ \Sigma x_0 : S.\ T).\ P \Mapsto \forall y : (\Pi\Delta.\ S), z : (\Pi\Delta.\ T\{y\,\Delta\}).\ P\{\lambda\Delta.\,(y\,\Delta, z\,\Delta)\}$  (4.15)

4.2.6 Summary of the algorithm 

Figure 4.14 summarises the problem decomposition steps, and Figure 4.15 sum- 
marises the steps for transforming the metacontext, discussed in the previous 
subsections. In addition to the steps already discussed, the latter figure includes 
the symmetry step (4.26), which saves writing out symmetrical variants of all the 
other steps, and the suffix step (4.27), which allows other steps to be applied at 
an arbitrary point in the metacontext. 

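Any variables that appear on the right but not on the left are implicitly
assumed to be freshly generated, so they do not conflict with any existing names.

Reflexivity

$(s : S) \approx (t : T) \Mapsto \top$  (4.1)
    if $\Theta \mid \Gamma \vdash \mathsf{Type} \ni S \equiv[U]\equiv T$ and $\Theta \mid \Gamma \vdash U \ni s \equiv t$

$\eta$-expansion

$(f : \Pi x : S.\ T) \approx (g : \Pi x : U.\ V) \Mapsto \forall\hat{x} : S \ddagger U.\ (f\,\acute{x} : T\{\acute{x}\}) \approx (g\,\grave{x} : V\{\grave{x}\})$  (4.2)
$(s : \Sigma x : S.\ T) \approx (t : \Sigma x : U.\ V) \Mapsto (s\ \mathsf{hd} : S) \approx (t\ \mathsf{hd} : U) \wedge (s\ \mathsf{tl} : T\{s\ \mathsf{hd}\}) \approx (t\ \mathsf{tl} : V\{t\ \mathsf{hd}\})$  (4.3)

Rigid-rigid decomposition

$\Pi x : S.\ T \approx \Pi x : U.\ V \Mapsto S \approx U \wedge \forall\hat{x} : S \ddagger U.\ T\{\acute{x}\} \approx V\{\grave{x}\}$  (4.4)
$\Sigma x : S.\ T \approx \Sigma x : U.\ V \Mapsto S \approx U \wedge \forall\hat{x} : S \ddagger U.\ T\{\acute{x}\} \approx V\{\grave{x}\}$  (4.5)
$x \cdot e \approx x' \cdot e' \Mapsto x \cdot e \bowtie x' \cdot e'$  (4.6)
$s \approx t \Mapsto \bot$ if $s \perp t$  (4.7)

$\eta$-contraction of subterms

$P\{\lambda x.\,n\,x\} \Mapsto P\{n\}$  (4.8)
$P\{(n\ \mathsf{hd}, n\ \mathsf{tl})\} \Mapsto P\{n\}$  (4.9)

Parameter simplification

$\forall x : T.\ P \Mapsto P$ if $x \notin \mathrm{fv}(P)$  (4.10)
$\forall\hat{x} : S \ddagger T.\ P \Mapsto P$ if $x \notin \mathrm{fv}(P)$  (4.11)
$\forall\hat{x} : S \ddagger T.\ P\{\acute{x}\} \Mapsto \forall x : S.\ P$ if $\Theta \mid \Gamma, x : S \vdash P\ \mathsf{wf}$  (4.12)
$\forall\hat{x} : S \ddagger T.\ P\{\grave{x}\} \Mapsto \forall x : T.\ P$ if $\Theta \mid \Gamma, x : T \vdash P\ \mathsf{wf}$  (4.13)
$\forall\hat{x} : S \ddagger T.\ P \Mapsto \forall x : U.\ P\{x, x\}$ if $\Theta \mid \Gamma \vdash \mathsf{Set} \ni S \equiv[U]\equiv T$  (4.14)
$\forall x : (\Pi\Delta.\ \Sigma x_0 : S.\ T).\ P \Mapsto \forall y : (\Pi\Delta.\ S), z : (\Pi\Delta.\ T\{y\,\Delta\}).\ P\{\lambda\Delta.\,(y\,\Delta, z\,\Delta)\}$  (4.15)

Figure 4.14: Problem decomposition steps

Solving equations by inversion (4.2.1)

$\Theta, \alpha : T, \Xi,\ ?\forall\Gamma.\ \alpha\,\overline{x_i}^i \approx t \mapsto \Theta, \Xi_0, \alpha :=^{*} \lambda\overline{x_i}^i.\,t : T, \Xi_1$  (4.16)
    if $\Xi = \Xi_0, \Xi_1$; $\overline{x_i}^i$ is linear on $\mathrm{fv}(t)$ and $\Theta, \Xi_0 \mid \cdot \vdash T \ni \lambda\overline{x_i}^i.\,t$

$\Theta,\ ?\forall\Gamma.\ \alpha\,\overline{x_i}^i \approx t \mapsto \Theta,\ ?\bot$  (4.17)
    if $t \neq \alpha \cdot e'$ and either $\alpha \in \mathrm{fmv}^{\mathrm{srig}}(t)$ or $\alpha\,\overline{y_i}^i$ occurs rigidly in $t$

Solving flex-flex equations by intersection (4.2.2)

$\Theta, \alpha : \Pi\Delta.\ T, \Xi,\ ?\forall\Gamma.\ \alpha\,\overline{x_i}^i \approx \alpha\,\overline{y_i}^i \mapsto \Theta, \beta : \Pi\Delta'.\ T, \alpha :=^{*} \lambda\Delta.\ \beta\,\Delta', \Xi$  (4.18)
    if $\Delta' = \mathrm{intersect}\ \Delta\ \overline{x_i}^i\ \overline{y_i}^i$ and $\mathrm{fv}(T) \subseteq \mathrm{vars}(\Delta')$

Pruning (4.2.3)

$\Theta, \beta : \Pi\Delta.\ T, \Xi,\ ?\forall\Gamma.\ \alpha \cdot e \approx t \mapsto \Theta, \gamma : \Pi\Delta'.\ T, \beta :=^{*} \lambda\Delta.\ \gamma\,\Delta', \Xi,\ ?\forall\Gamma.\ \alpha \cdot e \approx t$  (4.19)
    if $\mathrm{pruneTm}\ (\mathrm{fv}(e))\ t \mapsto (\beta, \Delta')$

$\Theta,\ ?\forall\Gamma.\ \alpha \cdot e \approx t \mapsto \Theta,\ ?\bot$ if $\mathrm{fv}^{\mathrm{rig}}(t) \not\subseteq \mathrm{fv}(e)$  (4.20)

Metavariable simplification (4.2.4)

$\Theta, \alpha : \Pi\Delta.\ \Sigma x : S.\ T \mapsto \Theta, \alpha_0 : \Pi\Delta.\ S, \alpha_1 : \Pi\Delta.\ T\{\alpha_0\,\Delta\}, \alpha := \lambda\Delta.\,(\alpha_0\,\Delta, \alpha_1\,\Delta)$  (4.21)

$\Theta, \alpha : \Pi\Delta.\ \Pi x : (\Pi\Delta'.\ \Sigma z : S.\ T).\ U \mapsto$  (4.22)
    $\Theta, \beta : \Pi\Delta.\ \Pi y : (\Pi\Delta'.\ S).\ \Pi z : (\Pi\Delta'.\ T\{y\,\Delta'\}).\ U\{\lambda\Delta'.\,(y\,\Delta', z\,\Delta')\},$
    $\alpha := \lambda\Delta.\ \lambda x.\ \beta\,\Delta\,(\lambda\Delta'.\ x\,\Delta'\ \mathsf{hd})\,(\lambda\Delta'.\ x\,\Delta'\ \mathsf{tl})$

Problem simplification (4.2.5)

$\Theta,\ ?\forall\Gamma.\ P \mapsto \Theta,\ ?\forall\Gamma.\ Q$ if $P \Mapsto Q$  (4.23)
$\Theta,\ ?\forall\Gamma.\ P \wedge Q \mapsto \Theta,\ ?\forall\Gamma.\ P,\ ?\forall\Gamma.\ Q$  (4.24)
$\Theta,\ ?\top \mapsto \Theta$  (4.25)

Symmetry and metacontext suffix

$\Theta,\ ?\forall\Gamma.\ s \approx t \mapsto \Theta'$ if $\Theta,\ ?\forall\Gamma.\ t \approx s \mapsto \Theta'$  (4.26)
$\Theta, \Xi \mapsto \Theta', \iota\Xi$ if $\Theta \mapsto \Theta'$  (4.27)

Figure 4.15: Constraint solving steps

4.3 Correctness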



In order to prove that unification correctly solves equational problems, I must 
first explain what it means for a problem to be solved. I will show that the 
unification logic is consistent, and that the steps of the unification algorithm are 
sound for the logic. Moreover, I will prove that every step is most general (in 
an appropriate sense). Total completeness cannot be expected, but I will show 
a partial completeness result for the pattern fragment under the assumption of 
termination. However, it is difficult to prove termination and I conclude this 
section with a discussion of the problems involved. 

4.3.1 Solved problems and logical consistency 

An equation $(s : S) \approx (t : T)$ is solved if it is true according to the definitional
equality, i.e. $\Theta \mid \Gamma \vdash \mathsf{Type} \ni S \equiv[U]\equiv T$ and $\Theta \mid \Gamma \vdash U \ni s \equiv t$. More generally,
a problem is solved if the equations it contains are true in the definitional equality.
This is captured by the judgment $\Theta \mid \Gamma \vdash P\ \mathsf{is}$, defined in Figure 4.16. This
requires twins to have equal types, so they can be replaced with a single variable.

Solved problems satisfy the expected substitution properties, proved by struc- 
tural induction on derivations using Lemma 4.1 and Lemma 4.2: 

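Lemma 4.7. If $\Theta \mid \Gamma \vdash \delta : \Delta$ and $\Theta \mid \Gamma, \Delta, \Gamma' \vdash P\ \mathsf{is}$ then $\Theta \mid \Gamma, \delta\Gamma' \vdash \delta P\ \mathsf{is}$.

Lemma 4.8. If $\theta : \Theta \sqsubseteq \Theta'$ and $\Theta \mid \Gamma \vdash P\ \mathsf{is}$ then $\Theta' \mid \theta\Gamma \vdash \theta P\ \mathsf{is}$.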

A metacontext is solved if all its hypothesised problems are solved. If a
problem is solved, it is true, that is, if $\Theta \mid \Gamma \vdash P\ \mathsf{is}$ then $\Theta \mid \Gamma \vdash P$. I will show that
the converse holds provided $\Theta$ is solved: problems assuming only solved
hypotheses are themselves solved. This is essentially a cut elimination or normalisation
result, as it says that any proof of a problem can be reduced to a normal form, 
with the normal form proofs of equations being definitional equalities. 

In Subsection 4.3.2, I will show that unification steps are sound in the sense 
that they preserve provability of problems. Hence, if the algorithm steps to a 
solved metacontext, then the problems it started from must be solved. 

The potential presence of twins forces me to prove a slightly more general 
result, which allows any twins in the context to be replaced with definitionally 
equal terms. The desired result for the empty context is then an immediate 
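corollary. Say that a substitution $\Theta \mid \Delta \vdash \delta : \Gamma$ identifies twins if for all $x : S \ddagger T \in \Gamma$
we have $\Theta \mid \Delta \vdash \mathrm{Type} \ni \delta S \equiv U \equiv \delta T$ and $\Theta \mid \Delta \vdash U \ni \delta\acute{x} \equiv \delta\grave{x}$.

$\Theta \mid \Gamma \vdash P \ \mathbf{is}$   ($P$ is solved in $\Theta$ and $\Gamma$)

$$\frac{\Theta \mid \Gamma \vdash \mathbf{ctx}}{\Theta \mid \Gamma \vdash \top \ \mathbf{is}}
\qquad
\frac{\Theta \mid \Gamma, x : S \vdash P \ \mathbf{is}}{\Theta \mid \Gamma \vdash \forall x : S.\, P \ \mathbf{is}}$$

$$\frac{\Theta \mid \Gamma \vdash \mathrm{Type} \ni S \equiv U \equiv T \qquad \Theta \mid \Gamma, x : U \vdash P\{x, x\} \ \mathbf{is}}{\Theta \mid \Gamma \vdash \forall x : S \ddagger T.\, P \ \mathbf{is}}$$

$$\frac{\Theta \mid \Gamma \vdash \mathrm{Type} \ni S \equiv U \equiv T \qquad \Theta \mid \Gamma \vdash U \ni s \equiv t}{\Theta \mid \Gamma \vdash (s : S) \approx (t : T) \ \mathbf{is}}
\qquad
\frac{\Theta \mid \Gamma \vdash \mathbf{ctx}}{\Theta \mid \Gamma \vdash (\mathrm{Set} : \mathrm{Type}) \approx (\mathrm{Set} : \mathrm{Type}) \ \mathbf{is}}$$

$$\frac{\Theta \mid \Gamma \vdash P \ \mathbf{is} \qquad \Theta \mid \Gamma \vdash Q \ \mathbf{is}}{\Theta \mid \Gamma \vdash P \wedge Q \ \mathbf{is}}$$

Figure 4.16: Solved problems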

Lemma 4.9. If $\Theta$ is solved, $\Theta \mid \Gamma \vdash P$ and $\delta$ is a substitution from $\Gamma$ to $\Delta$ that
identifies twins, then $\Theta \mid \Delta \vdash \delta P \ \mathbf{is}$.

Proof. By induction on the derivation of $\Theta \mid \Gamma \vdash P$. The absence of hypothetical
problems or first-class quantification over problems makes it easy to show
that the rules of the unification logic (Figure 4.6) correspond to solved problems
(Figure 4.16). For details, see Appendix D.3.1 (page 242). □

Corollary 4.10. If $\Theta$ is solved and $\Theta \mid \cdot \vdash P$ then $\Theta \mid \cdot \vdash P \ \mathbf{is}$.

Corollary 4.11 (Consistency). If $\Theta$ is solved, there is no derivation of $\Theta \mid \cdot \vdash \bot$.

A metasubstitution $\theta : \Theta \sqsubseteq \Theta'$ is a solution of $\Theta$ if $\Theta'$ is solved. Now if $?P \in \Theta$,
then $\Theta' \mid \cdot \vdash \theta P$ by Lemma 4.2, and hence $\Theta' \mid \cdot \vdash \theta P \ \mathbf{is}$ by Corollary 4.10.

4.3.2 Soundness 

Since the algorithm works in small steps, it is easy to verify that each is type 
safe. All permutations of the metacontext respect dependency. Whenever the 
algorithm instantiates a metavariable, it does so with a term of the appropriate 
type. Moreover, every unification problem is replaced with an equivalent conjunc- 
tion of unification problems. Crucially, the algorithm uses heterogeneous equality 
to make it easy to represent the telescopes of equations that arise from dependent 
arguments, potentially allowing progress on some equations even if the equation 
that makes their types equal is initially blocked. Despite this, and unlike typing 
modulo, every solution is well typed up to the definitional equality, making the 
algorithm useful when mixing typechecking with elaboration.

Lemma 4.12. If $\Theta \mid \Gamma \vdash P \ \mathbf{wf}$ and $P \mapsto Q$ then

(a) $\Theta \mid \Gamma \vdash Q \ \mathbf{wf}$, and

(b) $\Theta \mid \Gamma \vdash Q$ implies $\Theta \mid \Gamma \vdash P$.

Proof. By case analysis on the decomposition step. I must show that the truth
of $Q$ implies the truth of $P$, so that replacing a hypothesis $?P$ with $?Q$ leads to
a valid metasubstitution. For details, see Appendix D.3.2 (page 245). □

Lemma 4.13. If $\Theta \vdash \mathbf{mctx}$ and $\Theta \mapsto \Theta'$ then $\iota : \Theta \sqsubseteq \Theta'$.

Proof. By induction on the step taken, using Lemma 4.12 for problem decomposition.
For details, see Appendix D.3.2 (page 246). □

Theorem 4.14 (Soundness). If $\Theta \vdash \mathbf{mctx}$ and $\Theta \mapsto^{*} \Theta'$ where $\Theta'$ is solved,
then $\iota : \Theta \sqsubseteq \Theta'$ is a solution of $\Theta$.

Proof. Follows from Lemma 4.13 by induction on the number of steps. □

4.3.3 Generality

The algorithm is carefully designed to make no unforced intensional choices: that 
is, metavariables are instantiated only if the value is unique up to definitional 
equality. This corresponds to finding most general unifiers. The particular strat- 
egy for tackling constraints is unimportant, as the order in which constraints are 
solved does not make a difference to the result. Implementations are free to make 
alternative choices, provided all constraints are eventually dealt with. Of course, 
since vectors of equations arise from telescopes, it will usually make sense to solve 
the leftmost equations first so that later equations become homogeneous. Indeed, 
the reference implementation always works on the leftmost problem for which 
progress can be made (see Appendix C.4.6, page 235). 
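
The "leftmost problem for which progress can be made" strategy can be sketched in a few lines of Haskell. This is only an illustration under strong simplifying assumptions, not the reference implementation of Appendix C.4.6: the types Tm and Eqn and the functions decompose, stepLeftmost and solve are hypothetical names, terms are first-order with no metavariables, and the only step is structural decomposition of an equation into equations between arguments.

```haskell
infix 4 :=:

-- A first-order term: a constructor applied to arguments (no metavariables).
data Tm = Con String [Tm]
  deriving (Eq, Show)

-- An equation between two terms.
data Eqn = Tm :=: Tm
  deriving (Eq, Show)

-- One decomposition step: an equation between applications of the same
-- constructor is replaced by the conjunction of equations between arguments.
-- A mismatch cannot make progress, so it is left in place.
decompose :: Eqn -> Maybe [Eqn]
decompose (Con f ss :=: Con g ts)
  | f == g && length ss == length ts = Just (zipWith (:=:) ss ts)
  | otherwise                        = Nothing

-- Apply the step function to the leftmost problem on which it succeeds.
stepLeftmost :: (p -> Maybe [p]) -> [p] -> Maybe [p]
stepLeftmost _    []       = Nothing
stepLeftmost step (p : ps) =
  case step p of
    Just qs -> Just (qs ++ ps)
    Nothing -> (p :) <$> stepLeftmost step ps

-- Keep stepping until no step applies; what remains is stuck.
solve :: (p -> Maybe [p]) -> [p] -> [p]
solve step ps = maybe ps (solve step) (stepLeftmost step ps)
```

In this toy setting, solve decompose returns the empty list when every equation decomposes away, and otherwise returns the residual equations on which no step applies. Any other fair choice of which problem to attempt next would leave the same residue, which is the sense in which the strategy is unimportant.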

Lemma 4.15 (Generality of problem decomposition). If $\Theta \mid \Gamma \vdash P \ \mathbf{wf}$, the metasubstitution
$\theta : \Theta, ?\forall\Gamma.\, P \sqsubseteq \Theta'$ is a solution and $P \mapsto Q$, then $\theta : \Theta, ?\forall\Gamma.\, Q \sqsubseteq \Theta'$.

Proof. By case analysis on $P \mapsto Q$, supposing that $\theta(\forall\Gamma.\, P)$ is solved and showing
that $\theta(\forall\Gamma.\, Q)$ is solved. For details, see Appendix D.3.3 (page 247). □

Theorem 4.16 (Generality). If $\Theta_0 \vdash \mathbf{mctx}$, the metasubstitution $\theta : \Theta_0 \sqsubseteq \Theta'$ is
a solution and $\Theta_0 \mapsto \Theta_1$ then there exists a cofactor $\zeta : \Theta_1 \sqsubseteq \Theta'$ such that $\theta \equiv \zeta \cdot \iota$.

Proof. By induction on the step taken, using Lemma 4.15 for problem decomposition.
For details, see Appendix D.3.3 (page 248). □

Figure 4.17: Pattern fragment

4.3.4 Partial completeness 

As I observed in the introduction, full higher-order unification is undecidable, so 
the algorithm is incomplete in general. I will show that it is complete for the 
static Miller pattern fragment, where all metavariables are applied to distinct 
bound variables, assuming it terminates. It goes beyond the pattern fragment 
in handling Σ-types, and postponing non-pattern problems in case they become
solvable later. I believe that it handles a sufficiently broad class of problems to 
be useful for elaboration of a dependently typed language. 

A term t is in the pattern fragment if, for every evaluation context of a 
metavariable $\alpha \cdot e$ in t, e consists solely of projections and applications to distinct
variables. This is captured by the judgment t pat defined in Figure 4.17. The 
definition could be extended to allow projections of variables, provided they are 
distinct in an appropriate sense. For technical reasons in the completeness proof, 
the result type of an if-expression cannot contain metavariables. A problem is in 
the pattern fragment if all the terms it equates are in the pattern fragment. A 
metacontext is in the fragment if all its hypothesised problems are. 
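
The side condition on metavariable spines can be made concrete with a small checker. The following is a minimal sketch, assuming a hypothetical term representation (V and M for variable-headed and metavariable-headed spines, App, Hd and Tl for eliminators); it ignores variable scope, types and the restriction on if-expressions, so it approximates the judgment of Figure 4.17 rather than implementing it.

```haskell
import Data.List (nub)

-- Eliminators: application to an argument, or a projection from a pair.
data Elim = App Tm | Hd | Tl

-- Terms in spine form (hypothetical representation).
data Tm
  = V String [Elim]   -- variable applied to a spine of eliminators
  | M String [Elim]   -- metavariable applied to a spine of eliminators
  | Lam String Tm     -- lambda abstraction
  | Pair Tm Tm        -- pair

-- A term is a pattern if every metavariable spine is a pattern spine.
isPat :: Tm -> Bool
isPat (Lam _ b)  = isPat b
isPat (Pair s t) = isPat s && isPat t
isPat (V _ es)   = all elimPat es
isPat (M _ es)   = maybe False distinct (spineVars es)

-- Arguments of a variable-headed spine must themselves be patterns.
elimPat :: Elim -> Bool
elimPat (App t) = isPat t
elimPat _       = True

-- Collect the variables a metavariable is applied to, failing if any
-- argument is not a bare variable. Projections contribute no variables.
spineVars :: [Elim] -> Maybe [String]
spineVars es = concat <$> traverse elimVar es
  where
    elimVar (App (V x [])) = Just [x]
    elimVar (App _)        = Nothing
    elimVar _              = Just []

-- True when the list contains no repeated element.
distinct :: Eq a => [a] -> Bool
distinct xs = length (nub xs) == length xs
```

For instance, a metavariable applied to x and y is accepted, while the same metavariable applied to x twice, or to another metavariable, is rejected.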

To show partial completeness, I will prove that the algorithm can always take 
a step unless the metacontext is already solved or it contains a contradiction. A 
metacontext is failed if it contains ⊥ as a hypothesised problem.

Lemma 4.17. Suppose $\Theta$ is a well-formed metacontext in the pattern fragment
that is not solved or failed. Then $\Theta \mapsto \Theta'$ for some $\Theta'$ in the pattern fragment.

Proof. By considering the structure of the first unsolved problem in $\Theta$, demonstrating
that at least one step of the algorithm must apply. The heterogeneity
invariant means that twins or heterogeneous problems must have provably equal
types, and for the first unsolved problem, Corollary 4.10 implies that they must be
definitionally equal. Hence heterogeneity will not prevent progress. For details,
see Appendix D.3.4 (page 249). □



Theorem 4.18. If $\Theta$ is a well-formed metacontext in the pattern fragment, and
$\Theta \mapsto^{*} \Theta'$ such that no more steps apply, then $\Theta'$ is solved or failed.

Proof. Follows immediately from Lemma 4.17: if 0' were not solved or failed, 
then the algorithm could take a step. □ 

4.3.5 Towards a proof of termination 

Intuitively, it seems obvious that the algorithm terminates: each step makes the 
metacontext simpler, either by decomposing a unification problem into smaller 
components, by solving a metavariable, or by replacing a metavariable with one 
or more metavariables of smaller type. 

However, it is difficult to construct a termination ordering. The conventional 
approach is to define a measure on the sizes of terms and types in the context, 
then show that each step of the algorithm reduces the measure. Abel and Pien- 
tka (2011) exhibit a suitable ordinal-based measure to show termination of their 
algorithm for LF. 

The picture is more complex for the full-spectrum dependent type theory
I have outlined, thanks to the presence of large elimination and metavariables
standing for types. Defining a metavariable that occurs in a type can result in
types becoming larger, which is not the case in LF. It is thus not clear how to
calculate the size of a metavariable. If one takes the supremum over all possible
instantiations of a metavariable when calculating its size, then splitting up
inhabitants of Σ-types by step (4.21) does not strictly decrease the measure in the
resulting ordering.

Any proof of termination will need to take account of the stratification of 
the type theory. Obviously, if the underlying theory is not strongly normalising 
then encoding a divergent term can result in non-termination of unification. How- 
ever, in an inconsistent system even simpler non-termination is possible. Suppose 
our type theory included the axiom that there is a type of all types, sometimes 
written Set : Set. Martin-Löf (1975) had to abandon this axiom after Girard
demonstrated its inconsistency. Now consider the context 

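$$\alpha : \Sigma X{:}\mathrm{Set}.\,X,\ \ ?\,\alpha \approx (\Sigma X{:}\mathrm{Set}.\,X,\ \alpha).$$

As $\alpha$ has a Σ-type, a reasonable step is to split it into its components, giving

$$\beta : \mathrm{Set},\ \ \gamma : \beta,\ \ ?\,(\beta, \gamma) \approx (\Sigma X{:}\mathrm{Set}.\,X,\ (\beta, \gamma)).$$

Now the equation can be decomposed into

$$\beta : \mathrm{Set},\ \ \gamma : \beta,\ \ ?\,\beta \approx \Sigma X{:}\mathrm{Set}.\,X,\ \ ?\,\gamma \approx (\beta, \gamma)$$

and solving $\beta := \Sigma X{:}\mathrm{Set}.\,X$ yields

$$\gamma : \Sigma X{:}\mathrm{Set}.\,X,\ \ ?\,\gamma \approx (\Sigma X{:}\mathrm{Set}.\,X,\ \gamma)$$

which is the original problem. Applying the unification algorithm is therefore not
guaranteed to terminate, in the presence of the Set : Set axiom.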

The lack of a termination proof for the unification algorithm (applied to the 
correctly stratified version of the theory) is rather unsatisfactory, and it is left as 
an open issue for future work.⁶ It should be possible to stratify the proof in the
same manner as the theory, demonstrating termination for small problems, then 
extending the result to the full theory with large eliminations. 

4.4 Discussion 

I have presented an algorithm for higher-order dynamic pattern unification in a 
full-spectrum dependent type theory. The approach to problem solving in this 
thesis, based on representing metavariables and problems in an ordered context, 
allows careful control over dependency and makes it easy to suspend work on one 
problem while the algorithm tries to solve another. 

The algorithm is optimised for clarity rather than performance, and I have not
considered its algorithmic complexity. A 'real' implementation would probably
need to use a representation of terms with more control over depth of evaluation,
rather than working solely with βδ-normal forms. Some care is also necessary to
determine when to attempt each step: the reference implementation uses a fairly
naive approach, recording the fact that no more steps apply to a given problem,
but not the conditions under which this will change. Thus every problem must
be examined again whenever a substitution changes its type. Similarly, rather
than repeatedly checking to see if the types of metavariables can be simplified,
as in the reference implementation, projections could be eliminated only as they
arise in unification problems.

⁶ Termination of higher-order unification can be surprisingly subtle: Dowek et al. (1996)
describe a pattern unification algorithm for which termination can fail, as Reed (2009b, §5.1.1)
explains. The algorithm I have described is at least not vulnerable to the same counterexample!

In this chapter, I described unification for a very restricted type theory, but the 
algorithm can be extended to support inductive types, proof irrelevance and other 
advanced features. It therefore forms the base on which to build an elaborator 
for a full-spectrum dependently typed language, in the style of Agda or Epigram. 

However, it is now time to take a different tack. In the second part of this 
thesis I will describe an extension of Haskell with dependent types. Underlying the 
elaboration algorithm for this language, as described in Chapter 7, is a constraint 
solver that makes use of the techniques for unification and type inference described 
in this chapter and those that preceded it.

Part II

Haskell with dependent types

Chapter 5

The inch language: adding dependent types to Haskell

Modern Haskell's poorly-concealed support for dependent types is increasingly be- 
ing used to obtain correctness guarantees for Haskell programs (McBride, 2002). 
From the ubiquitous vectors, to well-scoped λ-terms and more exotic examples,
dependent types allow programmers to express their intentions more precisely. 
However, many of these experiments are testaments to the versatility of gener- 
alised algebraic datatypes, multi-parameter type classes, functional dependencies 
and type families, rather than practical programming techniques. In particular, 
working with type-level numbers and teaching arithmetic to a compiler is a com- 
plex, inefficient business; the syntax is ugly, error messages are convoluted and 
typechecking is sometimes difficult to predict. 

Wouldn't it be nicer if we could write programs like the following? 

data Vec :: * -> N -> * where
  Nil  :: Vec a Zero
  Cons :: a -> Vec a n -> Vec a (Suc n)

append :: Vec a m -> Vec a n -> Vec a (m + n)
append Nil         ys = ys
append (Cons x xs) ys = Cons x (append xs ys)

replicate :: Π (n :: N) -> a -> Vec a n
replicate Zero    _ = Nil
replicate (Suc n) x = Cons x (replicate n x)

The inch language presented in this part extends Haskell with dependent functions
(Π-types), promoted datatypes (including the integers), type-level arithmetic
operations and integer constraints. This is not just an attempt to turn
Haskell into Agda or similar full-spectrum dependently typed languages. A clear
account of the phase distinction and the operational behaviour of programs is
needed. Working in a weaker system enables more powerful type inference. Moreover,
the equational theory of arithmetic is not just β-reduction: programming
with dependent types can be made easier by automatically solving constraints
that depend on algebraic properties (such as the commutativity of addition).

This chapter consists of an overview of related systems (including those based 
on current features of GHC) and an informal introduction to the syntax and 
features of inch by means of examples. Following this introduction to the high- 
level language, I will define a corresponding language of evidence in Chapter 6. 
Typechecking the evidence language is straightforward, and it is suitable as an 
intermediate language during compilation. It is very explicit (for example, all 
type abstractions and applications must be present in the syntax), so information 
omitted from inch programs must be inferred when producing the corresponding 
evidence program. This translation, called elaboration, is the focus of Chapter 7. 
I will demonstrate larger examples of the use of inch in Chapter 8. 

The description of elaboration develops the approach to the Hindley-Milner 
system studied in Chapter 2. I will not study constraint solving in detail, but 
the unification algorithm for abelian groups in Chapter 3 and the higher-order 
unification algorithm in Chapter 4 demonstrate the basic ideas. 

5.1 Related work 

No idea exists in a vacuum. In this section, I will summarise the ideas and pre- 
decessor systems on which inch is based, including the current state of Haskell as 
implemented in GHC, and more distantly related work. In the following section, 
I will lay out the key features of inch, comparing it to these systems as I do so. 

5.1.1 Full-spectrum dependently typed languages 

In full-spectrum dependently typed languages such as Agda (Norell, 2007), based
on Martin-Löf Type Theory (Nordström et al., 1990), arbitrary terms can be
used to index types. Numbers can be modelled as an inductive datatype and
mathematical operations defined on them by recursion. The type theory can
be used to prove equations needed to make a program type check. There are no
limitations on the form of numeric expressions (to linear functions or polynomials,
for example), since the only automatic constraint solving arises from computation
(β-reduction) when checking definitional equality.

Suppose we have the following standard definitions (in Agda syntax):

data N : Set where
  Zero : N
  Suc  : N → N

_+_ : N → N → N
Zero  + n = n
Suc m + n = Suc (m + n)

data Vec (A : Set) : N → Set where
  Nil  : Vec A Zero
  Cons : ∀ {n} → A → Vec A n → Vec A (Suc n)

Vector concatenation is easily defined by recursion on the first argument, because 
the + function is also recursive on its first argument: 

_++_ : ∀ {A m n} → Vec A m → Vec A n → Vec A (m + n)
Nil       ++ ys = ys
Cons x xs ++ ys = Cons x (xs ++ ys)

However, defining vector reverse is trickier, because + does not reduce if its first 
argument is neutral and its second is canonical. Consider the following: 

reverse : ∀ {A m} → Vec A m → Vec A m
reverse xs = help xs Nil
  where
    help : ∀ {A m n} → Vec A m → Vec A n → Vec A (m + n)
    help Nil         ys = ys
    help (Cons x xs) ys = help xs (Cons x ys)

The definition of reverse is not accepted, because m + Zero ≢ m, and the second
line of help is not accepted, because m + Suc n ≢ Suc (m + n). Instead, the user
must insert explicit appeals to a proof of the commutativity of +. The equational 
theory of addition is not merely given by a recursive definition! 

In general, the user may need to prove many properties of the mathemati- 
cal operators they have defined. There has been some work on automating this, 
particularly via tactics in the interactive theorem prover Coq (Grégoire and Mahboubi,
2005), but integrating this with programming can be difficult.

5.1.2 Dependent ML

Xi (1998, 2007) describes Dependent ML (DML), a conservative extension of 
ML that supports "a restricted form of dependent types." Formally, DML is a 
language schema parameterised on a constraint domain C from which type indices 
are drawn. Type checking is reduced to constraint solving in C. Instantiating 
C with a language of arithmetic expressions results in a system for type-level 
numbers, but other choices are possible, such as the theory of free algebraic terms. 
Xi and Pfenning (1998) demonstrate one application of dependent numeric types: 
the safe elimination of runtime array-bounds checks. 

The development of DML led Xi and coworkers to design the Applied Type
System (ATS) framework (Xi, 2004) and the ATS language (Chen, 2006).

Dependent ML is a major inspiration for this work, but extending Haskell
with dependent types and type-level numbers requires more than adapting Xi's
work to another syntax. While DML extends ML with a fixed domain of indices
and constraints, I show how extensions to the Haskell kind system allow indexing
by arbitrary type-level expressions, and I focus on the introduction of Π-types,
which are not supported by DML.

One feature of DML that is absent from inch is support for effects. Since
Haskell is more-or-less a pure language, with effects encapsulated in the IO monad,
there is no need for specific consideration of effects in the type system, nor for
the value restriction. I will not discuss effects further in this thesis.

5.1.3 Generalised algebraic datatypes 

Unlike normal algebraic datatypes, generalised algebraic datatypes (GADTs) al- 
low the return types of data constructors to specialise the indices of the datatype, 
by imposing additional equality constraints that must be satisfied on construc- 
tion and become available through pattern-matching. Thus they are a kind of 
inductive family indexed by type-level expressions. By defining suitable type 
constructors, numerically indexed types can be approximated, for example: 

data ZeroType

data SucType :: * -> *

data VecGADT :: * -> * -> * where
  NilGADT  :: VecGADT a ZeroType
  ConsGADT :: a -> VecGADT a n -> VecGADT a (SucType n)

The GADT translation replaces each expression in an index of the result type
with a variable, and imposes an equality constraint between the variable and the
original expression. This gives:

NilGADT  :: forall a m . m ~ ZeroType => VecGADT a m
ConsGADT :: forall a m n . m ~ SucType n =>
            a -> VecGADT a n -> VecGADT a m

The idea for GADTs dates back to a draft by Augustsson and Petersson (1994), 
although the closely related inductive families (Dybjer, 1994) have a much longer 
history in the dependent types community. A variety of names have arisen for 
essentially the same concept. Early theoretical treatments of GADTs were given 
by Xi et al. (2003) (under the name guarded recursive datatypes) and Cheney 
and Hinze (2003) (as first-class phantom types). They were later studied by 
Sheard and Pasalic (2008) (as equality-qualified types), Simonet and Pottier (2007)
(as guarded algebraic datatypes) and Peyton Jones et al. (2006) who christened
them GADTs. Much subsequent work has gone into finding good type inference
algorithms, especially in the presence of other advanced type system features, 
and they are well-supported in recent versions of GHC. 

Moving beyond the free algebraic equational theory of type constructors, the 
associated type families (Chakravarty et al., 2005) extension to GHC allows type- 
level functions to be defined. For example, addition can be given thus: 

type family m + n 

type instance ZeroType + n = n 

type instance SucType m + n = SucType (m + n) 

A similar approach is possible using multi-parameter type classes and functional 
dependencies (Jones, 2000). In both cases, however, the type-level programming 
is effectively untyped (as all types have kind *). There is nothing to stop one 
forming the type Z + Bool, or even declaring such nonsense as 

type instance Z + Bool = Z 
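
The pieces of this subsection can be assembled into one module that GHC accepts. The sketch below makes a few assumptions of its own: the type family is named Plus rather than + to avoid relying on operator syntax, kinds are written with Type from Data.Kind instead of *, and appendGADT and toList are hypothetical helper names. It also includes an ill-kinded instance in the spirit of the one above, to show that nothing rejects it.

```haskell
{-# LANGUAGE GADTs, KindSignatures, TypeFamilies #-}
import Data.Kind (Type)

-- Type-level "numbers" encoded as empty datatypes of kind Type.
data ZeroType
data SucType (n :: Type)

-- Vectors indexed by an encoded length.
data VecGADT :: Type -> Type -> Type where
  NilGADT  :: VecGADT a ZeroType
  ConsGADT :: a -> VecGADT a n -> VecGADT a (SucType n)

-- Addition as an open type family, recursive on its first argument.
type family Plus (m :: Type) (n :: Type) :: Type
type instance Plus ZeroType    n = n
type instance Plus (SucType m) n = SucType (Plus m n)

-- Nonsense that is nevertheless accepted: Bool is not an encoded number,
-- but every index has kind Type, so the kind checker cannot object.
type instance Plus Bool n = Bool

-- Concatenation typechecks because Plus reduces in step with the recursion.
appendGADT :: VecGADT a m -> VecGADT a n -> VecGADT a (Plus m n)
appendGADT NilGADT         ys = ys
appendGADT (ConsGADT x xs) ys = ConsGADT x (appendGADT xs ys)

-- Forget the length index.
toList :: VecGADT a n -> [a]
toList NilGADT         = []
toList (ConsGADT x xs) = x : toList xs
```

Loading this in GHCi and evaluating toList (appendGADT (ConsGADT 'a' NilGADT) (ConsGADT 'b' NilGADT)) exercises both type instances for encoded numbers.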

5.1.4 Haskell libraries

McBride (2002) showed that 'faking it' is a viable technique for simulating nu- 
meric dependent types in Haskell, including type-safe vector operations and ma- 
trix multiplication. Subsequently, there have been numerous implementations of 
type-level numbers as libraries using existing features of Haskell. The Hackage 
repository includes the packages sized-types, type-level, type-level-numbers,
type-level-natural-number, numtype and undoubtedly some with more equivocal
names! Kiselyov (2005) discusses several possible encodings including a
particularly ingenious decimal representation, and manages to get a long way 
without using any extensions to Haskell 98. 

Many of these libraries have arisen in response to the need for type-level 
numbers in a particular application. For example, the sized-types library is 
part of Kansas Lava (Gill et al., 2009), a DSL for hardware description. The 
ForSyDe project (Acosta, 2008) uses fixed-size vectors as part of a DSL for mod- 
elling computation using signals and processes. Eaton (2006) describes a linear 
algebra library that provides static guarantees about the dimensions of vectors 
and matrices, ensuring compatibility when they are multiplied. 

Impressive as these libraries are, they are all hampered by the limitations im- 
posed by the language, in such areas as syntax, type inference and clarity of error 
messages. Better language support for type-level data would make it possible to 
move beyond these limitations and produce a more user-friendly system. 

5.1.5 GHC TypeNats 

Recently, Diatchki (n.d.) has developed an extension to GHC that supports type- 
level natural numbers, adding a new kind Nat. The choice of natural numbers 
rather than integers is motivated by the intended applications, such as measuring 
the sizes of datatypes, but there is no fundamental reason why the alternative 
choice could not be made. Of course, natural numbers are easily recovered from 
integers and an inequality constraint, but the reverse is not so easy. 

Work is underway to support arithmetic operations on natural numbers, in- 
cluding addition, multiplication and exponentiation. The plan is for them to 
be described by type families that trigger special behaviour in GHC's constraint 
solver (Vytiniotis et al., 2011). 

5.2 Features of inch 

Having described the giants on whose shoulders I am standing, I now give an 
overview of the inch language and compare it to its predecessors. The reader may 
wish to consult Chapter 8 alongside this section, for more extensive examples of 
the use of some of these features.

5.2.1 Down with kinds



The Haskell kind system has been expanding for some time. From including just 
the kind * and function spaces, it has grown to encompass 'promoted' datatypes 
and kind polymorphism, as described by Yorgey et al. (2012). Promotion allows 
arbitrary algebraic datatypes to be used as kinds. For example, 

data Nat = Zero | Suc Nat

allows Nat to be used as a kind, and its constructors to be used as types, as in:

data Vec :: * -> Nat -> * where
  Nil  :: Vec a Zero
  Cons :: a -> Vec a n -> Vec a (Suc n)

This is a significant improvement on the essentially untyped type-level program- 
ming that is otherwise required with GADTs. However, the class of types that can 
be promoted is somewhat limited. In particular, GADTs cannot themselves be 
promoted. This prevents indexing a type by a GADT, which is necessary for more 
advanced dependently typed programming. For example, it is not straightforward 
to extend the traditional GADT example of well-typed terms in the simply-typed 
λ-calculus so that contexts are represented by vectors. The following is rejected,
because Vec cannot be promoted:

data Elem :: Vec k n -> k -> * where
  Top :: Elem (Cons a v) a
  Pop :: Elem v a -> Elem (Cons b v) a

data Tm :: Vec * n -> * -> * where
  Var :: Elem v a -> Tm v a
  Lam :: Tm (Cons a v) b -> Tm v (a -> b)
  App :: Tm v (a -> b) -> Tm v a -> Tm v b

Now the kind system has algebraic datatypes, function spaces and polymorphism, 
so it increasingly resembles the type system, or at least the type system without 
recent extensions. Why not simplify matters by removing the distinction between 
types and kinds? This eliminates the boundary between 'promotable' and 'non- 
promotable' datatypes. It is a conceptual simplification, because users do not 
need to learn two slightly different type systems, and it means that type and kind 
checking become the same operation, which may reduce the burden of specifying 
and implementing the compiler.

Weirich et al. (2013) show that this identification of types and kinds gives a
perfectly good intermediate language. They adopt the typing rule * : *, rather 
than a hierarchy of universes, because logical soundness of the system is not a con- 
cern. Haskell has general recursion anyway! On the other hand, type soundness 
(progress and preservation) is retained. The inch system follows their approach. 

ML lacks higher kinds (all types are of kind *), so DML distinguishes between 
types and indices, with the latter having a sort drawn from the underlying con- 
straint language C. It adds a single index to each datatype, and ensures types
appear only applied to an index value. Multiple indices can be supplied as pairs. 

Integrating type-level data into a single type and kind system, as in inch, gives 
a great deal of extra expressivity. For example, the type of reflexive transitive 
closures of binary relations on a can be defined in general, then specialised: 

data RTC :: (a -> a -> *) -> a -> a -> * where
  Embed      :: r m n -> RTC r m n
  Reflexive  :: RTC r n n
  Transitive :: RTC r l m -> RTC r m n -> RTC r l n
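
To illustrate the "defined in general, then specialised" point, here is a rough approximation in GHC Haskell rather than inch, assuming the DataKinds and PolyKinds extensions. The relation Step, the synonym LE and the function steps are hypothetical names introduced only for this example.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, PolyKinds #-}
import Data.Kind (Type)

data Nat = Zero | Suc Nat

-- Reflexive transitive closure of a binary relation r on indices of kind a.
data RTC :: (a -> a -> Type) -> a -> a -> Type where
  Embed      :: r m n -> RTC r m n
  Reflexive  :: RTC r n n
  Transitive :: RTC r l m -> RTC r m n -> RTC r l n

-- A hypothetical relation on promoted naturals: n is related to Suc n.
data Step :: Nat -> Nat -> Type where
  Step :: Step n ('Suc n)

-- Specialising the closure to Step gives evidence that m is at most n.
type LE = RTC Step

-- Count the embedded steps in a proof; this works for any relation r.
steps :: RTC r m n -> Int
steps (Embed _)        = 1
steps Reflexive        = 0
steps (Transitive p q) = steps p + steps q

-- Evidence that zero is at most two, built from two single steps.
twoSteps :: LE 'Zero ('Suc ('Suc 'Zero))
twoSteps = Transitive (Embed Step) (Embed Step)
```

The same RTC declaration could equally be instantiated with a relation on types of kind Type, since the index kind a is itself a parameter.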

DML supports polymorphism over type indices, but since parametric polymorphism
in ML is restricted to types of kind *, a separate quantifier is needed. It
uses Πa:γ.τ for the universally quantified type of elements of τ polymorphic in
an index of sort γ. Since this is a type, it can appear on the left of an arrow,
effectively permitting a limited form of higher-rank polymorphism. Unlike the
usual notion of a dependent function space (Π-type) in type theory, this construct
is parametric: the value a is not available at runtime and the function cannot
eliminate it by case analysis. It thus corresponds to ∀ in inch.

5.2.2 Dependent functions 

How might the replicate function, which repeats a value n times to produce a list, 
be extended to return vectors? The type of the resulting vector depends on the 
integer argument, so the argument must be known statically (available during 
typechecking), but the operational behaviour of the function also requires it, so 
it must be available at runtime. It really requires a dependent Π-type:

replicate :: Π (n :: N) -> a -> Vec a n
replicate Zero    _ = Nil
replicate (Suc n) x = Cons x (replicate n x)

The variable n is bound in the range of the Π-type, just as for a universally
quantified type scheme, but it also appears in the patterns defining the function.

An alternative to introducing explicit Π-types is connecting the term and
type levels using singleton types. In this approach, a family of types is indexed 
by type-level representations of term-level data, so that each type has a single 
inhabitant. In Haskell with GADTs and datatype promotion, the example can 
be expressed using a singleton type SingNat thus: 

data SingNat :: Nat -> * where
  SingZero :: SingNat Zero
  SingSuc  :: SingNat n -> SingNat (Suc n)

replicateSing :: SingNat n -> a -> Vec a n
replicateSing SingZero    _ = Nil
replicateSing (SingSuc n) x = Cons x (replicateSing n x)

Converting between the representations requires additional functions. Here a 
higher-rank function has been used to convert the runtime Nat into the singleton 
SingNat; an alternative approach is to use existential types (see Subsection 5.2.3). 

forget :: SingNat n -> Nat
forget SingZero    = Zero
forget (SingSuc n) = Suc (forget n)

remember :: Nat -> (forall n . SingNat n -> t) -> t
remember Zero    f = f SingZero
remember (Suc n) f = remember n (f . SingSuc)
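
For readers who want to experiment, the singleton encoding can be collected into one module for current GHC. This is a sketch under a few assumptions: promoted constructors are written with a leading tick, kinds use Type from Data.Kind, and toList and replicateList are hypothetical helpers added to show how remember is used at a call site.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, RankNTypes #-}
import Data.Kind (Type)

data Nat = Zero | Suc Nat
  deriving (Eq, Show)

data Vec :: Type -> Nat -> Type where
  Nil  :: Vec a 'Zero
  Cons :: a -> Vec a n -> Vec a ('Suc n)

-- Each type SingNat n has exactly one inhabitant, mirroring n at runtime.
data SingNat :: Nat -> Type where
  SingZero :: SingNat 'Zero
  SingSuc  :: SingNat n -> SingNat ('Suc n)

replicateSing :: SingNat n -> a -> Vec a n
replicateSing SingZero    _ = Nil
replicateSing (SingSuc n) x = Cons x (replicateSing n x)

-- Singleton to plain runtime number.
forget :: SingNat n -> Nat
forget SingZero    = Zero
forget (SingSuc n) = Suc (forget n)

-- Plain runtime number to singleton, in continuation-passing style because
-- the index n is not known statically.
remember :: Nat -> (forall n . SingNat n -> t) -> t
remember Zero    f = f SingZero
remember (Suc n) f = remember n (f . SingSuc)

-- Forget the length index of a vector.
toList :: Vec a n -> [a]
toList Nil         = []
toList (Cons x xs) = x : toList xs

-- Replicate from an ordinary Nat; the vector's length cannot escape the
-- continuation, so the result is returned as a list.
replicateList :: Nat -> a -> [a]
replicateList n x = remember n (\s -> toList (replicateSing s x))
```

The continuation passed to remember must work for every n, which is why the vector has to be consumed (here by toList) before the result leaves its scope.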

There is some duplication and redundancy inherent in this approach, since term- 
level data must be re-expressed at the type level, but some of this can be taken 
care of by the compiler. Monnier and Haguenauer (2010) show how to convert 
from the Calculus of Constructions into a non-dependent language with singleton 
types. The Strathclyde Haskell Enhancement (McBride, 2010b) supports defining 
the type-level copy and singleton GADT for an algebraic datatype automatically, 
and the singletons library of Eisenberg and Weirich (2012) goes even further 
than this, using Template Haskell to automatically convert sufficiently simple 
term-level functions into type families. 

Dependent ML uses singletons, rather than Π-types in the sense above.

5.2.3 Dependent existential types 

A key feature of DML is its support for dependent existential types, allowing (for 
example) the type of lists to be replaced by vectors of existentially quantified
length. This is useful for abstraction purposes, or when the invariants being
maintained are difficult to express at the type level. For example, the length of 
the list returned by filter depends on how many elements satisfy the predicate, 
and rather than building this into the type, another option is to return a vector 
of existentially quantified length, with a type like 

filter :: (a → Bool) → Vec a n → ∃ m . Vec a m.

This is a powerful but complex feature, as the combination of parametric polymor- 
phism and existential dependent types significantly complicates type inference. 
An alternative, introduced by Läufer and Odersky (1992) and used in Haskell, is
to associate existential values with data constructors, closing the existential pack- 
age when data is constructed and opening it when pattern-matching. A variable 
is existentially quantified if it does not appear in the parameters associated with 
its constructor. I use this option for inch. It is less flexible than genuine exis- 
tential types, as in DML, but it is also significantly simpler for type inference 
purposes and is familiar to Haskell programmers through its support in GHC. 

Xi (2007) argues that connecting existential types with data constructors leads 
to a need for too many datatypes with slightly different constraints, and Chen 
(2006, p. 23) further suggests that "indirect support to existential types is sim- 
ply impractical in the presence of dependent types", using the example of the 
singleton family of integers in DML. However, higher-kinded and higher-rank 
polymorphism ameliorate the problem to an extent, as does native support for 
Π-types rather than using the singleton encoding. For example, the datatype

data Ex :: (k → *) → * where
  MkEx :: f x → Ex f

allows any singly-indexed type to be converted into an existential. It can safely 
be eliminated via rank-2 polymorphism: 

unEx :: ∀ a f . (∀ x . f x → a) → Ex f → a
unEx g (MkEx x) = g x

Admittedly, the usual problems with argument order for higher-kinded types will 
arise: Ex (Vec a) is conveniently the type of vectors of existentially quantified 
length, but if its arguments were reversed, Ex (Vec n) would be the rather less 
useful type of vectors of fixed length but unknown element type. In general, a 
small amount of bureaucratic constructor shuffling may be necessary, but this 
seems reasonable given the complications of type inference for existentials. 
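
To illustrate, here is a minimal GHC Haskell sketch of the filter example using the Ex wrapper. It assumes the DataKinds, GADTs, KindSignatures, PolyKinds and RankNTypes extensions; the names filterVec, filterList and toList are hypothetical.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, PolyKinds, RankNTypes #-}

import Data.Kind (Type)

data Nat = Zero | Suc Nat

data Vec (a :: Type) (n :: Nat) where
  Nil  :: Vec a 'Zero
  Cons :: a -> Vec a n -> Vec a ('Suc n)

-- Existential package for any singly-indexed type.
data Ex (f :: k -> Type) where
  MkEx :: f x -> Ex f

-- Eliminate the package with a function that works for every index.
unEx :: (forall x. f x -> a) -> Ex f -> a
unEx g (MkEx x) = g x

-- The length of the result is hidden, since it depends on the predicate.
filterVec :: (a -> Bool) -> Vec a n -> Ex (Vec a)
filterVec _ Nil = MkEx Nil
filterVec p (Cons x xs) =
  case filterVec p xs of
    MkEx ys
      | p x       -> MkEx (Cons x ys)
      | otherwise -> MkEx ys

-- Hypothetical helper: forget the length index of a vector.
toList :: Vec a n -> [a]
toList Nil         = []
toList (Cons x xs) = x : toList xs

-- Open the package only to extract index-independent information.
filterList :: (a -> Bool) -> Vec a n -> [a]
filterList p xs = unEx toList (filterVec p xs)
```

The package is closed by MkEx when the result is constructed and opened again by pattern matching, as in the approach of associating existentials with data constructors described above.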

5.2.4 Implicit and explicit arguments

When should it be possible to omit the argument of a function? Milner (1978) 
achieved a remarkable coincidence, as Lindley and McBride (2013) observe: 

Syntactic category     Types            Terms
In source language     Implicit         Explicit
Abstraction            Dependent (∀)    Non-dependent (→)
Runtime                Erased           Present

So neat is this coincidence that one may forget to distinguish these concepts. 

However, as more advanced type systems have been developed, Milner's co- 
incidence has been stretched. On the positive side, Wadler and Blott (1989) 
introduced typeclasses, a system of implicit term-level arguments that are not 
erased at runtime. More negatively, current GHC sometimes insists on playing a 
frustrating guessing game, where it does not allow a type-level argument to be 
specified but tries to reconstruct it by unification, which is not always possible. 
That is, there are implicit static arguments that would be better made explicit. 

For example, consider the following definitions: 

type family F a

f :: F a → F a
f x = x

g :: F a → F a
g = f

The definition of g is rejected by GHC even though its type is syntactically
identical to that of f, because it helpfully freshens a to a₀, then fails to solve for
the original a since F might not be injective.¹

A folklore trick often used to solve this problem is to declare a 'proxy type' 
with a single phantom parameter. This allows an extra argument to be added to 
each function where the type should be passed explicitly, annotating the proxy 
constructor appropriately: 

data Proxy (a :: k) = Proxy

f' :: Proxy a → F a → F a
f' _ x = x

g' :: ∀ b . F b → F b
g' = f' (Proxy :: Proxy b)

1 In fact, GHC even rejects g without a type signature, presumably because it tries to recheck
the type it has inferred and hits the same problem.

While this provides a workaround for the problem, it is quite invasive, as the
original definition of the function needs to be changed, and it is syntactically 
noisy. The ability to write type application explicitly in source Haskell is long 
overdue; the only major stumbling block is deciding upon the concrete syntax. 
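
For comparison, the following minimal sketch shows the proxy workaround in a form that compiles with more recent versions of GHC than the one discussed here. It assumes the AllowAmbiguousTypes, ScopedTypeVariables, TypeApplications and TypeFamilies extensions; TypeApplications provides explicit type application, written @Int below. The instance F Int = Bool and the name example are purely illustrative.

```haskell
{-# LANGUAGE AllowAmbiguousTypes, ScopedTypeVariables, TypeApplications, TypeFamilies #-}

import Data.Proxy (Proxy (..))

-- An open type family with one illustrative instance.
type family F a
type instance F Int = Bool

-- The proxy argument determines a, so this type is not ambiguous.
f' :: Proxy a -> F a -> F a
f' _ x = x

-- The type variable b occurs only under F, so the signature is ambiguous and
-- is accepted only because AllowAmbiguousTypes is switched on.
g' :: forall b. F b -> F b
g' = f' (Proxy :: Proxy b)

-- A caller of g' must choose b explicitly, here by type application.
example :: Bool
example = g' @Int True
```

The proxy still has to be threaded through f', which is the invasiveness complained about above; the type application at the call site of g' is what replaces guessing by unification.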

On the other hand, it is often desirable to omit arguments that can be
reconstructed mechanically. This does not necessarily correspond to the
type-level/term-level or compile-time/runtime distinctions: runtime terms may well
be inferred if the types determine them. This issue has been studied extensively 
in the setting of dependently typed programming languages, in particular by Pol- 
lack (1990). A common approach is to allow certain arguments of functions to 
be designated implicit,² with the idea that they will be found automatically
during type inference (typically by unification). For example, in Agda the replicate
function can be written 

replicate : {a : Set} {n : N} → a → Vec a n
replicate {n = Zero}  _ = Nil
replicate {n = Suc n} x = Cons x (replicate x)

Now n is implicit by default at use sites, since it can usually be inferred from the 
context, even though it is critical for the runtime behaviour of the function. This 
is a big win: the compiler is writing operationally relevant parts of the program! 

Implicit arguments are written in curly braces in the type, and may be omitted 
by default in patterns and expressions, or specified by wrapping them in curly 
braces. Both positional and named variants on the notation are available. In 
{n = Suc n}, the first n specifies the implicit argument to match, and the
second is a binding occurrence. The fact that an implicit argument can always be 
specified explicitly if necessary avoids the problems discussed above. Agda-style 
notation would allow a much neater solution to the problem discussed above: 

g'' :: ∀ b . F b → F b
g'' = f {a = b}

In Section 7.1, I will show how inch supports implicit argument notation. It 
adopts a slight generalisation of the Haskell syntax for quantifiers in types: a dot 
following the binder means the quantification is implicit, while an arrow means the 
quantification is explicit. Thus ∀ a . τ and Π (n :: N) . υ are implicitly quantified,
while ∀ (a :: *) → τ and Π n → υ are explicitly quantified. For applications, the
Agda-style named implicit argument notation is used, as in f {a = b}.

2 Implicit arguments are not the same as the 'implicit parameters' of Lewis et al. (2000), 
which are a construct for dynamic scoping. 

Implicit Π-types

Typeclasses provide a form of term-level implicit arguments for Haskell. Along 
with the singleton encoding, this allows an approximation of an implicit Π-type:

class ImplicitNat (n :: Nat) where
  sing :: SingNat n

instance ImplicitNat Zero where
  sing = SingZero

instance ImplicitNat n ⇒ ImplicitNat (Suc n) where
  sing = SingSuc sing

A class context containing ImplicitNat n means that n is passed implicitly. It is 
meaningful even though an obvious induction shows that the predicate holds for 
every canonical Nat. An implicit version of replicate can be defined thus: 

replicateImplicit :: ImplicitNat n ⇒ a → Vec a n
replicateImplicit = replicateSing sing

However, there are now three variations on a single type (Nat, SingNat and 
ImplicitNat), all of which must be understood by the programmer. Moreover, 
switching between explicit and implicit arguments is clumsy: sing :: Sing x must 
be used in place of a simple x. 

Implicit Π-types are often useful in class instances. For example, in order
to make Flip Vec n a monad, the length n must be supplied at runtime so that 
replicate can be used in the implementation of return: 

newtype Flip f x y = Flip {unFlip :: f y x}

instance Π (n :: N) . Monad (Flip Vec n) where
  return = Flip ∘ replicate n
  Flip xs >>= f = Flip (help xs (unFlip ∘ f))
    where
      help :: Vec a m → (a → Vec b m) → Vec b m
      help Nil         g = Nil
      help (Cons x xs) g = Cons (vhead (g x)) (help xs (vtail ∘ g))
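
In plain GHC Haskell the same instance can be approximated with the ImplicitNat class in place of the implicit Π-type. The following is a minimal sketch, assuming the DataKinds, FlexibleInstances, GADTs, KindSignatures and PolyKinds extensions; the helper names diag, vmap and toList are hypothetical, and the Functor and Applicative instances are included only because current GHC requires them as superclasses of Monad.

```haskell
{-# LANGUAGE DataKinds, FlexibleInstances, GADTs, KindSignatures, PolyKinds #-}

import Data.Kind (Type)

data Nat = Zero | Suc Nat

data Vec (a :: Type) (n :: Nat) where
  Nil  :: Vec a 'Zero
  Cons :: a -> Vec a n -> Vec a ('Suc n)

data SingNat (n :: Nat) where
  SingZero :: SingNat 'Zero
  SingSuc  :: SingNat n -> SingNat ('Suc n)

-- Class-based stand-in for an implicit Π-type: n travels as a dictionary.
class ImplicitNat (n :: Nat) where
  sing :: SingNat n

instance ImplicitNat 'Zero where
  sing = SingZero

instance ImplicitNat n => ImplicitNat ('Suc n) where
  sing = SingSuc sing

replicateSing :: SingNat n -> a -> Vec a n
replicateSing SingZero    _ = Nil
replicateSing (SingSuc n) x = Cons x (replicateSing n x)

vhead :: Vec a ('Suc n) -> a
vhead (Cons x _) = x

vtail :: Vec a ('Suc n) -> Vec a n
vtail (Cons _ xs) = xs

vmap :: (a -> b) -> Vec a n -> Vec b n
vmap _ Nil         = Nil
vmap f (Cons x xs) = Cons (f x) (vmap f xs)

toList :: Vec a n -> [a]
toList Nil         = []
toList (Cons x xs) = x : toList xs

-- Diagonal bind: position i of the result is position i of (g x_i).
diag :: Vec a m -> (a -> Vec b m) -> Vec b m
diag Nil         _ = Nil
diag (Cons x xs) g = Cons (vhead (g x)) (diag xs (vtail . g))

newtype Flip f x y = Flip { unFlip :: f y x }

instance Functor (Flip Vec n) where
  fmap f = Flip . vmap f . unFlip

-- pure needs the length at runtime, supplied here by the class dictionary.
instance ImplicitNat n => Applicative (Flip Vec n) where
  pure = Flip . replicateSing sing
  Flip fs <*> Flip xs = Flip (diag fs (\f -> vmap f xs))

instance ImplicitNat n => Monad (Flip Vec n) where
  Flip xs >>= f = Flip (diag xs (unFlip . f))
```

Only pure consumes the runtime length; the bind operation is driven entirely by the structure of its first argument.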

5.2.5 Type-level numbers

I have already shown several examples of the use of type-level natural numbers in 
measuring the lengths of vectors. For many applications involving measuring the 
sizes of datatypes, natural numbers suffice, and they have an obvious inductive
definition from zero and successor constructors, as shown above. Most Haskell 
libraries for type-level numbers use naturals, as does the TypeNats extension to 
GHC (Diatchki, n.d.). 

However, another choice is available: the integers. This increases expressivity 
as natural numbers can be recovered from integers using inequality constraints 
(see Subsection 5.2.7). DML (Xi, 2007) takes this choice. 

The design considerations for a language extension are different to those of 
a library. There is no restriction to ad-hoc type-level programming techniques. 
Type inference may be easier for integers, because they form an abelian group, 
allowing the unification algorithm from Chapter 3 to be used. 

Moreover, there are some use cases that rely on negative as well as positive 
integers, such as implementing a library for units of measure. Given a fixed set 
of base units, a derived unit can be represented by its integer exponents: for 
example, metres per second (m/s) could be represented by 1 as the exponent 
of metres, −1 as the exponent of seconds and 0 as the exponent of other base
units. The NumType library of Buckwalter (2009) is one of the few libraries to 
support negative numbers for exactly this reason. In Chapter 8, units of measure 
are developed using the inch system. 
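
As a small value-level illustration of why negative exponents are needed (the systems discussed here track units in types, which this sketch does not attempt), a derived unit over two base units can be represented by a pair of integers. All names below are hypothetical.

```haskell
-- Hypothetical representation of a derived unit by the integer exponents of
-- two base units (metres and seconds).
data Unit = Unit { metreExp :: Int, secondExp :: Int }
  deriving (Eq, Show)

dimensionless, metre, second :: Unit
dimensionless = Unit 0 0
metre         = Unit 1 0
second        = Unit 0 1

-- Multiplying quantities adds the exponents of their units.
mulUnit :: Unit -> Unit -> Unit
mulUnit (Unit m1 s1) (Unit m2 s2) = Unit (m1 + m2) (s1 + s2)

-- Inverting a unit negates its exponents, which is impossible with naturals.
invUnit :: Unit -> Unit
invUnit (Unit m s) = Unit (negate m) (negate s)

divUnit :: Unit -> Unit -> Unit
divUnit u v = mulUnit u (invUnit v)

-- Metres per second: exponent 1 for metres and -1 for seconds.
metresPerSecond :: Unit
metresPerSecond = divUnit metre second
```

Units under mulUnit and invUnit form an abelian group with dimensionless as the identity, which is the structure that makes integers convenient for unification.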

Zenger (1997) describes a Haskell-like language with types indexed by
polynomials over the complex numbers. Gröbner basis techniques can then be used
to solve constraints. This is an interesting choice of constraint domain, but does 
not quite match most of the examples, which expect integers or natural numbers. 
This may lead to overly permissive type-checking (if constraints with no integer 
solution can be solved in ℂ) or failures to deduce desired properties (for example,
n > 0 does not imply n ≥ 1).

The prototype implementation of the inch language supports a kind Z of 
integers, plus a kind N of natural numbers that is treated as syntactic sugar for 
Z with an inequality constraint. I will focus on the addition of Π-types, rather
than numeric constraint solving, however. 

5.2.6 Supported operations 

Closely connected to the choice of numbers to represent is the signature of oper- 
ations that are available on them. Addition is a must for any nontrivial use of 
type-level numbers, even just appending vectors. If negative integers are permit- 
ted, then subtraction is also useful. If not, it is less clear what meaning (if any) 
to give subtraction; though there are several options (Runciman, 1989), perhaps 
it is easiest to require types to be rewritten to avoid it.

With just addition (and perhaps subtraction) one can express multiplication 
by constants and many useful linear properties, while remaining within the theory 
of Presburger arithmetic. This theory is decidable (Presburger, 1930, translated 
by Stansifer (1984)), so complete constraint solving is feasible. It should not be 
dismissed out of hand, as many useful examples can be expressed in this fragment. 

Diatchki's TypeNats extension includes addition, multiplication and expo- 
nentiation on natural numbers, but omits their (partial) inverses. This leads to 
interesting challenges in designing a suitable constraint solver that is powerful 
enough to handle common constraints but also allows the user to supply proofs. 

Xi's constraint solver for DML handles only linear constraints, though his 
formalism allows for more complex numeric expressions, and he mentions the 
possibility of postponing nonlinear constraints in the hope that they will become 
linear and hence solvable. In his subsequent work on the ATS programming 
language (Xi, 2004), he argues for the combination of programming and theorem 
proving to allow the user to supply proofs of difficult constraints. 

5.2.7 Constraints 

When working with GADTs or type families, it is frequently useful to add equality 
constraints to qualified types; indeed GADTs are implemented using equality con- 
straints on constructors that are made available by pattern-matching. Similarly, 
equality and inequality constraints are useful for type-level numbers. 

The encoding of propositional equality in type theory (Nordström et al., 1990)
can be translated into Haskell thus (writing (~) for built-in equality constraints): 

data Id m n where
  Refl :: m ~ n ⇒ Id m n

elimEq :: ∀ a m n . Id m n → (m ~ n ⇒ a) → a
elimEq Refl x = x

One could abstract a over an index in the definition of elimEq, giving 

elimEq' :: ∀ (a :: t → *) m n . Id m n → a m → a n
elimEq' Refl x = x

but since Haskell's type-level function space lacks first-class λ-abstraction, it is
easier to work in the former style, using equality rather than abstraction. 

A decision procedure that produces a witness to the equality can be given by 
decideEq :: Π (m n :: Z) → Maybe (Id m n)

or even 

decideEq' :: Π (m n :: Z) → Either (Id m n) (Id m n → Void)

where negation is expressed as a function to the type Void with no constructors. 
This encoding of negation is not entirely satisfactory in a non-total language, 
however, since all types are inhabited. 

Alternatively, a function to compare two integers can be given a rank-2 type: 

ifEq :: Π (m n :: Z) → (m ~ n ⇒ a) → a → a

In the third argument, the assumption that m and n are equal is available to 
the typechecker. The kind of continuation-passing style demonstrated by ifEq 
is frequently useful to introduce additional hypotheses or eliminate existential 
type variables, showing the need for a system that integrates type-level data with 
arbitrary-rank polymorphism. Of course, it also makes use of Haskell's laziness 
and the corresponding ease of writing control operators. 
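
The continuation-passing style of ifEq can be approximated in GHC Haskell by replacing the Π-quantified integers with singleton natural numbers. The following is a minimal sketch under that assumption, using the DataKinds, GADTs, KindSignatures, RankNTypes and TypeOperators extensions; vzip, zipIfSame and toList are hypothetical names.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, RankNTypes, TypeOperators #-}

import Data.Kind (Type)

data Nat = Zero | Suc Nat

data Vec (a :: Type) (n :: Nat) where
  Nil  :: Vec a 'Zero
  Cons :: a -> Vec a n -> Vec a ('Suc n)

data SingNat (n :: Nat) where
  SingZero :: SingNat 'Zero
  SingSuc  :: SingNat n -> SingNat ('Suc n)

-- Propositional equality, packaging a built-in equality constraint.
data Id (m :: Nat) (n :: Nat) where
  Refl :: m ~ n => Id m n

-- Decision procedure that produces a witness when the numbers agree.
decideEq :: SingNat m -> SingNat n -> Maybe (Id m n)
decideEq SingZero    SingZero    = Just Refl
decideEq (SingSuc m) (SingSuc n) =
  case decideEq m n of
    Just Refl -> Just Refl
    Nothing   -> Nothing
decideEq _ _ = Nothing

-- The third argument is typechecked under the assumption m ~ n.
ifEq :: SingNat m -> SingNat n -> (m ~ n => a) -> a -> a
ifEq m n x y =
  case decideEq m n of
    Just Refl -> x
    Nothing   -> y

vzip :: Vec a n -> Vec b n -> Vec (a, b) n
vzip Nil         Nil         = Nil
vzip (Cons x xs) (Cons y ys) = Cons (x, y) (vzip xs ys)

-- Zipping is only well typed in the branch where the lengths are known equal.
zipIfSame :: SingNat m -> SingNat n -> Vec a m -> Vec b n -> Maybe (Vec (a, b) m)
zipIfSame m n xs ys = ifEq m n (Just (vzip xs ys)) Nothing

toList :: Vec a n -> [a]
toList Nil         = []
toList (Cons x xs) = x : toList xs
```

In zipIfSame the call vzip xs ys would be rejected outside ifEq, because nothing relates m and n there; inside the first continuation the hypothesis m ~ n is available.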

Going beyond equalities, inequality constraints (<, ≤, >, ≥) are useful in order
to express weak bounds. For example, they allow safe projection from a vector: 

index :: ∀ (m :: N) . Π (n :: N) → n < m ⇒ Vec a m → a
index Zero    (Cons x xs) = x
index (Suc n) (Cons x xs) = index n xs

Similar techniques can be used to create a safe array library that eliminates 
runtime bounds checks, as Xi and Pfenning (1998) taught us. 

When used in a quantifier, the natural number kind imposes a constraint on 
the bound variable: Π (n :: N) . τ translates to Π (n :: Z) . 0 ≤ n ⇒ τ. This is
similar to (though simpler and less expressive than) DML's notion of a 'subset
sort' (Xi, 1998), which allows a new sort to be formed by restricting an existing
sort with some constraints.³

Learning by testing 

A crucial feature for working with type-level data is the ability to perform type- 
refining dynamic tests, enabling "learning by testing" (Altenkirch et al., 2005). 
Dependently typed programming languages typically exploit dependent pattern
matching and techniques such as views (McBride and McKinna, 2004). Dependent
pattern matching is supported by inch, as in the replicate example.

3 A DML sort is similar to a Haskell kind, but restricted to terms in the index language C.

A small extension to Haskell's notation for guards is useful. I use curly braces 
to mark a guard, written in the constraint language, that refines the type of the 
corresponding branch. For example, the ifEq function can be implemented as:⁴

ifEq :: Π (m n :: Z) → (m ~ n ⇒ a) → a → a
ifEq m n x y | {m ≡ n}   = x
             | otherwise = y

The runtime behaviour of such expressions is straightforward: drop the curly 
braces to obtain the usual guard. If-expressions can be handled in a similar way. 

Helping the constraint solver 

Given an incomplete constraint solver, what can the user do if a program is 
rejected because a true constraint was not solved by the system? Sometimes it 
may be possible to extend the type signature by quantifying over the additional 
constraint, requiring callers to prove it; eventually a caller may be reached that 
supplies concrete values for variables, so the constraint is easily checked. However, 
in some cases it may not be possible to quantify over the required hypothesis, for 
example if the function pattern-matches on a GADT introducing local constraints. 

One possibility is to supply additional information to the typechecker using a 
higher-rank function. For example, a term for commutativity of multiplication 

commutes :: ∀ (m n :: Z) → (m * n ~ n * m ⇒ a) → a

would allow the user to write commutes m n x in place of an expression x that 
depends on the assumption m * n ~ n * m. The quantification over m and n is
explicit, even though they are erased at runtime. This is necessary because the 
typechecker will not be able to choose appropriate arguments. 

A trusted library of properties could be implemented as 'unsafe' coercions. 
If the variables were available at runtime (quantified over by Π rather than ∀),
such properties could be 'proved' by writing a recursive function to perform the 
necessary induction, but in a partial language this function must be executed at 
runtime in order to ensure type safety, which is likely to be undesirable. 
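
To make the last point concrete, here is a minimal GHC Haskell sketch of a property 'proved' by a recursive function over a singleton. It uses a simpler property than commutativity of multiplication, namely n + 0 ~ n for an assumed type family Plus on natural numbers. The names Plus, plusZero, append and appendNil are hypothetical, and the DataKinds, GADTs, KindSignatures, RankNTypes, TypeFamilies and TypeOperators extensions are assumed.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, RankNTypes, TypeFamilies, TypeOperators #-}

import Data.Kind (Type)

data Nat = Zero | Suc Nat

data Vec (a :: Type) (n :: Nat) where
  Nil  :: Vec a 'Zero
  Cons :: a -> Vec a n -> Vec a ('Suc n)

data SingNat (n :: Nat) where
  SingZero :: SingNat 'Zero
  SingSuc  :: SingNat n -> SingNat ('Suc n)

-- Type-level addition, defined by recursion on the first argument.
type family Plus (m :: Nat) (n :: Nat) :: Nat where
  Plus 'Zero    n = n
  Plus ('Suc m) n = 'Suc (Plus m n)

-- Induction on n, carried out at runtime: each recursive call provides the
-- induction hypothesis needed to use x at the successor.
plusZero :: SingNat n -> (Plus n 'Zero ~ n => a) -> a
plusZero SingZero    x = x
plusZero (SingSuc n) x = plusZero n x

append :: Vec a m -> Vec a n -> Vec a (Plus m n)
append Nil         ys = ys
append (Cons x xs) ys = Cons x (append xs ys)

-- Appending Nil yields a Vec a (Plus n 'Zero); the proof turns it into Vec a n.
appendNil :: SingNat n -> Vec a n -> Vec a n
appendNil n xs = plusZero n (append xs Nil)

toList :: Vec a n -> [a]
toList Nil         = []
toList (Cons x xs) = x : toList xs
```

Every use of appendNil walks the whole singleton before returning, which is the runtime cost referred to above; replacing plusZero by an unchecked coercion would avoid that cost at the price of trusting the property.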



4 Here ~ is the equality type constraint, ≡ the runtime equality test and = the Haskell
syntax for a definition!

Chapter 6

A language of evidence 



In this chapter, I describe the evidence language, suitable as an intermediate 
language for a Haskell compiler. The next chapter will describe how to elaborate 
inch terms into evidence terms. The language presented here is based on System 
Fc, the core language of GHC,¹ with modifications inspired by dependent type
theory to support the new features of inch and make the presentation uniform. 

One reason for compiling via an intermediate language, rather than directly 
to a low-level language, is to ensure correctness. It is analogous to the use of an 
easily checked kernel type theory in a proof assistant such as Coq. Typechecking 
intermediate language code is straightforward, as expressions encode their own 
typing derivations, and everything is fully explicit. Terms can be checked after 
elaboration and during optimisation, leading to early detection of compiler bugs. 

A key inspiration for this chapter is the work of Weirich, Hsu, and Eisenberg 
(2013). Like them, I adopt the dangerous-sounding rule * : *, so the kind of types 
classifies itself. To a dependent type theorist, this instantly suggests paradox,²
but the system will permit general recursion at the type level in any case, so the 
potential paradox is irrelevant. There is no hope of proving strong normalisation 
in general, but the usual subject reduction and progress properties are maintained. 
The system does include a logic of equality, and this must be kept consistent, 
which can be achieved by keeping it weak. Coercions encode the exact amount 
of computation to be done, so there is no risk that typechecking an evidence 
term will fail to terminate. Moreover, "the point of writing a proof in a strongly 
normalizing calculus is that you don't need to normalize it".³ There is no need
to compute coercions, whereas if coercions could be bogus, they would need to 
be normalised before being relied upon to coerce values. 

1 System Fc has developed over time; the main versions are discussed in Subsection 6.7.3. 
2 The 1971 type theory of Martin-Löf (1975, 1998) was inconsistent for this reason,
3 A saying of Randy Pollack, quoted by Altenkirch et al. (2005).



The main feature that the evidence language adds to previous versions of 
System Fc is Π-types, allowing types to depend on a limited fragment of 'shared'
runtime expressions. To enable a compact presentation of the system, I abstract 
over the possible 'phases' of quantification and typing judgments, and write a 
single set of typing rules covering both types and terms. This highlights the 
common structure and avoids repetition. For example, a single application rule 
replaces a multitude of rules for applying one sort of expression to another. 

Moreover, a single syntax and type system for type and term-level constructs 
allows them to have a common operational semantics, in the usual style of depen- 
dent type theory. This is a fundamental difference in perspective from System 
Fc. It leads to the replacement of type families (that are axiomatically defined
and lacking operational behaviour) with honest-to-goodness case analysis.
Type-level functions are then mere recursive definitions. There is no λ-abstraction at
the type level, and type-level functions must be saturated (fully applied), so the 
language of types is essentially first-order and elaboration is as simple as possible. 

Unlike type families, type-level functions as I define them do not support case 
analysis on types or the open world assumption. The two are not necessarily 
mutually exclusive. One could certainly imagine a system in which type families 
and true type-level (or shared type- and term-level) functions are both available. 

In Subsection 5.2.2 (page 96), I gave the example of the replicate function: 

replicate :: Π (n :: N) → a → Vec a n
replicate Zero    _ = Nil
replicate (Suc n) x = Cons x (replicate n x)

This uses its natural number argument both statically (as it occurs in the type) 
and dynamically (for pattern-matching at runtime). It can be seen as a single 
shared function that makes sense at the type level and the term level. 

For comparison, here is the same thing implemented using a type family and 
term-level singletons, the alternative to Π-types discussed in Subsection 5.2.2.⁴

type family Replicate (n :: N) (x :: a) :: Vec a n
type instance Replicate Zero    _ = Nil
type instance Replicate (Suc n) x = Cons x (Replicate n x)

replicateSing :: SingNat n → a → Vec a n
replicateSing SingZero    _ = Nil
replicateSing (SingSuc n) x = Cons x (replicateSing n x)

4 The Replicate type family is rejected by GHC 7.6, because it involves a promoted GADT. 
It is forbidden by the system of Weirich et al. (2013), which does not permit the result kind of 
a type family to depend on its arguments, but this may not be a fundamental restriction. 

The type family version can be defined directly on the kind of natural numbers,
but the term-level version must use a singleton copy to pattern-match at runtime. 
The connection between the term and type-level functions has been lost.

In the sequel, I introduce the syntax of the evidence language (6.1), discuss 
the key role that phase distinctions play (6.2) and give the type system for the 
language (6.3). I then define its operational semantics and prove subject reduction 
(6.4). Proving progress takes a little more work (6.5). Finally, I define a runtime 
erasure operation that removes types and coercions (6.6) and conclude with a 
discussion of possible extensions, related systems and future work (6.7). 

6.1 Syntax 

In this section, I present the syntax for the evidence language. It may be worth 
skipping quickly through this on first reading, and returning to clarify details of 
the syntax. Figure 6.1 shows the naming conventions in use in this chapter. 

Figure 6.2 gives the syntax of signatures and contexts. The signature Σ
contains global top-level symbols that may appear in expressions, including type
constructors D, data constructors K, functions f and axioms C. The context or
telescope Γ, Δ contains variables bound locally. Contexts will later be generalised
to metacontexts Θ, which include metavariables for use in elaboration (discussed
in Chapter 7).

The common syntax of expressions is shown in Figure 6.3. Unifying the syntax 
avoids redundancy, as there are unique forms for abstraction, application and 
quantification, and it simplifies the operational semantics. In Section 6.2, I will 
explain the use of phases Φ, Ψ to distinguish the different roles of types, coercions
and terms. Saturated function applications f(δ) are syntactically distinguished
from normal application.

For the sake of familiarity, Figure 6.4 gives subgrammars of ρ for type
expressions τ, υ, κ, coercions γ, η, runtime terms e and shared terms ε. Variables
are accounted for by a single production but I will frequently write a, b for type
variables, c for coercion variables and x, y, z for term variables. Propositions φ
are a subgrammar of types that represent quantified equations.

De Bruijn (1991) showed that working with telescopes of bindings Δ, and
vectors of expressions δ corresponding to them, rather than single bindings and
single substitution, is often a significant simplification. I write ψ for a vector
containing type expressions.

a, b              type variable             κ             kind
c                 coercion variable         λ             abstraction
e                 expression                [illegible]   value type
f                 function                  ρ             expression
i, j, k, l, m, n  integer                   τ, υ          type
r                 erased runtime term       φ             proposition
v                 value expression          ψ             vector of type expressions
x, y, z           term variable             ω             telescoped coercion

C                 coercion axiom            Γ, Δ          context (telescope)
D                 type constructor          Λ             abstraction
H                 rigid constructor         Π             dependent function space
K                 data constructor          Σ             signature

γ, η              coercion                  T             type phase
δ                 vector of expressions     Φ, Ψ          phase
ε                 shared term               Ω             non-type phase
ι                 identity substitution

Figure 6.1: Naming conventions



Σ    ::= · | Σ, D :^Φ κ | Σ, K :^Φ κ | Σ, C : φ | Σ, f [Δ] :^Φ κ | Σ, f [Δ] = ρ :^Φ κ

Γ, Δ ::= · | Γ, a :^Φ τ

Φ, Ψ ::= ∀ | Π | □ | λ

T    ::= ∀ | Π

Ω    ::= □ | λ

Figure 6.2: Grammar of signatures, contexts and phases

ρ ::=                            expression
      a                          variable
      ρ ^Φ ρ'                    application
      (a :^Φ κ) → ρ              quantification
      H                          constructor
      f(δ)                       saturated function
      ρ ▷ γ                      type cast
      g                          coercion evidence
      (d)case ρ of br            case expression
      Λ a :^Φ κ . ρ              abstraction

H ::=                            rigid constructor
      D                          type constructor
      K                          data constructor
      *                          kind of types
      (~)                        equality type

g ::=                            coercion evidence
      C                          axiom
      resp ω [illegible]         congruence
      left γ                     left injectivity
      right γ                    right injectivity
      cong@ T γ η                congruence of T application
      cong@ □ γ (η₁, η₂)         congruence of □ application
      cong Φ η γ                 congruence of quantification
      γ @ η                      congruence of T instantiation
      [illegible]                congruence of □ instantiation
      coh γ η                    coherence
      kind γ                     equality of kinds
      step ρ                     computation step

δ       ::= · | δ, ρ
ω       ::= · | ω, (τ, τ', γ) | ω, (ρ, ρ')
(d)case ::= dcase | case
br      ::= K Δ → ρ

Figure 6.3: Grammar of expressions

τ, υ, κ ::=                      type expression (phase ∀)
      a                          variable
      τ ^Φ ρ                     application at phase Φ
      (a :^Φ κ) → τ              quantification
      H                          constructor
      f(δ)                       saturated function
      τ ▷ γ                      type cast
      (d)case τ of br            case expression

γ, η ::=                         coercion (phase □)
      c                          variable
      γ ▷ η                      cast
      g                          coercion evidence
      Λ a :^Φ κ . γ              proof abstraction

e ::=                            runtime term (phase λ)
      x                          variable
      e ^Φ ρ                     application at phase Φ
      K                          data constructor
      f(δ)                       saturated function
      e ▷ γ                      type cast
      (d)case e of br            case expression
      Λ a :^Φ κ . e              abstraction

ε ::=                            shared term (phase Π)
      x                          variable
      ε ^Φ ρ                     application at phase Φ
      K                          data constructor
      f(δ)                       saturated function
      ε ▷ γ                      type cast
      (d)case ε of br            case expression

φ ::= (~) κ τ₁ τ₂ | (a :^Φ κ) → φ
ψ ::= · | ψ, τ

Figure 6.4: Subgrammars of type expressions, coercions and terms

6.2 Phase distinctions and promotion



Existing work by Yorgey et al. (2012) on System Fc↑, which extends System Fc
with type-level data, is based around the idea of 'promoting' datatypes to the 
kind level and data constructors to the type level. By a fortuitous coincidence, 
some terms turn out to be well-kinded type expressions, but there is no formal 
relationship between well-typed terms and well-kinded types. Not all datatypes 
can be promoted, since the kind system is more restrictive than the type system, 
although work is underway to change this (Weirich et al., 2013). 

Adding Π-types to a system with Fc↑-like promotion is possible, adding yet
more abstraction and application forms, and another typing judgment. However, 
factoring out the common structure makes the relationships between the phases 
clear. This is particularly true when it comes to the operational semantics: rather 
than trying to juggle separate rules for runtime terms, shared terms and type 
expressions, I can instead give a single system that covers them all. Of course, 
the purpose of the phase distinction is maintained: type expressions and coercions 
are erased at runtime, as discussed in Section 6.6. 

The evidence language distinguishes between phases given by

Φ, Ψ ::=      phase
      ∀       static phase (universal quantification)
      Π       shared phase (dependent product)
      □       coercion phase
      λ       runtime phase (function space)

There is a single typing judgment, annotated by the phase at which it holds.
Phases occur on quantifiers, Λ-abstractions and context bindings, to indicate the
phase at which variables are bound, and on applications, to indicate the phase
of the quantifier. This means that the typing rules have a single rule for each
construct, rather than a whole host of similar rules. It is not essential to unify
these concepts; one might choose to present the phases separately. The □ phase
must sometimes be distinguished, in order to ensure it remains consistent. In
particular, it cannot admit case analysis or recursive functions.

The phase annotations on typing judgments will justify the subgrammars
given in Figure 6.4, as whenever an expression ρ is well-typed at phase Φ, it will
belong to the subgrammar corresponding to Φ.

The single syntax for quantifiers (a :^Φ τ) → υ subsumes universal quantification
and the runtime function space: ∀ a : τ . υ becomes (a :^∀ τ) → υ and τ → υ
becomes (x :^λ τ) → υ. The latter is never dependent, however, as the typing rules
ensure x cannot be used in υ, so I will often write the familiar syntax instead.

Similarly, the single syntax for abstractions Λ a :^Φ τ . ρ subsumes λ-abstraction
over terms and Λ-abstraction over type expressions. Again, I will write the more
familiar λ x : τ . e instead of Λ x :^λ τ . e, but this is merely syntactic sugar.
Abstractions may occur only at phase λ or □: there is no type-level λ-abstraction.

6.2.1 The access policy 

The fortuitous coincidence that some terms are also well-kinded type expressions 
now turns into a solid metatheoretic property: all well-typed shared terms are 
both well-typed runtime terms and well-kinded type expressions. The 'access 
policy' relation Φ ≤ Ψ expresses when things at one phase can be used at
another. This is a partial order, defined by the following Hasse diagram:⁵



      ∀       λ
        \   /
   □      Π

The typing rule for variables (see Figure 6.6)

$$\frac{\Gamma \vdash \mathsf{ctx} \qquad \Gamma \ni a :^{\Phi} \kappa \qquad \Phi \le \Psi}{\Gamma \vdash a :^{\Psi} \kappa}$$

uses this relation: any variable bound at phase Φ is accessible at phase Ψ
whenever Φ ≤ Ψ. A key result (Lemma 6.4) extends this to show that if a typing
judgment holds at phase Φ, and Φ ≤ Ψ, then it holds at phase Ψ.

5 So $\Pi \hookrightarrow \forall$ and $\Pi \hookrightarrow \lambda$, while $\Box$ is lonely. 

6.2.2 Promoted data constructors 

Where does promotion fit into this system? The constructor Just has type 
$(a :^{\forall} *,\ x :^{\lambda} a) \to \mathsf{Maybe}\ a$, so it seemingly expects a static and a runtime 
argument. We want to be able to use it at the type level with static arguments, 
so that Just * Bool has type Maybe *. Thus the application rule 

\[ \frac{\Gamma \vdash \rho :^{\Psi} (a :^{\Phi} \kappa_1) \to \kappa_2 \qquad \Gamma \vdash \rho' :^{\Phi /\!/ \Psi} \kappa_1}{\Gamma \vdash \rho\,{}^{\Phi}\rho' :^{\Psi} [\rho'/a]\,\kappa_2} \] 

calculates the phase $\Phi /\!/ \Psi$ at which to check the argument from the phase $\Phi$ of 
the quantification and the phase $\Psi$ at which the expression is being checked. The 
relativisation operator $\Phi /\!/ \Psi$, pronounced '$\Phi$ for $\Psi$', is defined by 



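\[
\begin{array}{c|cccc}
\Phi /\!/ \Psi & \Psi = \lambda & \Psi = \Pi & \Psi = \forall & \Psi = \Box \\ \hline
\Phi = \lambda & \lambda & \Pi & \forall & \forall \\
\Phi = \Pi & \Pi & \Pi & \forall & \forall \\
\Phi = \forall & \forall & \forall & \forall & \forall \\
\Phi = \Box & \Box & \Box & \Box & \Box
\end{array}
\]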



When checking a runtime term, the phase of the typing judgment is $\lambda$, and 
$\Phi /\!/ \lambda$ is just $\Phi$, so arguments to runtime functions must be of the phase stated 
in their type. However, when checking in a static context, the argument must 
be known statically. This causes implicit promotion: $\lambda /\!/ \forall = \forall$ means that 
$\mathsf{Just}\ * :^{\forall} (x :^{\lambda} *) \to \mathsf{Maybe}\ *$ can be applied to $\mathsf{Bool} :^{\forall} *$. Since $\lambda \not\hookrightarrow \forall$ and 
$\lambda \neq \Phi /\!/ \forall$ for all $\Phi$, there is no way that a variable at phase $\lambda$ can be used in a 
type expression at phase $\forall$. 

I will usually omit the annotation on applications, writing $\rho\ \rho'$ instead of $\rho\,{}^{\Phi}\rho'$, 
since it is easily recovered from the type of $\rho$. It is useful for defining erasure as 
an operation on syntax rather than on typing derivations in Section 6.6. 
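
The access policy and the relativisation table can be modelled directly. The following Haskell sketch is illustrative only: the constructor names `Static`, `Shared`, `Runtime` and `Proof` and the function names `accessible` and `relativise` are hypothetical stand-ins for the four phases, the access policy and the relativisation operator, and are not part of the evidence language. It checks three facts used in this subsection: relativising for the runtime phase leaves a phase unchanged, a runtime quantifier relativised for the static phase becomes static, and nothing relativised for the static phase lands at the runtime phase.

```haskell
module Main where

-- The four phases, named by role (names are assumptions of this sketch).
data Phase = Static | Shared | Runtime | Proof
  deriving (Eq, Show, Enum, Bounded)

-- Access policy: may something bound at the first phase be used at the second?
accessible :: Phase -> Phase -> Bool
accessible p q | p == q   = True
accessible Shared Static  = True
accessible Shared Runtime = True
accessible _      _       = False

-- Relativisation: the phase at which to check an argument, given the phase
-- of the quantifier (first) and the phase of the judgment (second).
relativise :: Phase -> Phase -> Phase
relativise Proof   _       = Proof
relativise Static  _       = Static
relativise Shared  Runtime = Shared
relativise Shared  Shared  = Shared
relativise Shared  _       = Static
relativise Runtime Runtime = Runtime
relativise Runtime Shared  = Shared
relativise Runtime _       = Static

main :: IO ()
main = do
  -- Relativising for the runtime phase is the identity.
  print (all (\p -> relativise p Runtime == p) [minBound .. maxBound])
  -- Implicit promotion: a runtime argument position becomes static in types.
  print (relativise Runtime Static)
  -- No argument position checked in a static context is at the runtime phase.
  print (any (\p -> relativise p Static == Runtime) [minBound .. maxBound])
```

Encoding the table as a total function makes it easy to test such claims exhaustively, since there are only sixteen combinations.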

6.2.3 Promoted functions 

The (+) function is useful in terms, but appears also in the type of append 
for vectors. Therefore, the evidence language introduces a new style of 'shared' 
functions, which may occur in types and terms. 

Shared functions may appear as arguments at phase II, so type safety will 
require that reduction (in the operational semantics for shared terms) implies 
propositional equality (in the language of coercion proofs). An easy way to achieve 
this is to give a consistent operational semantics at all phases, rather than the 
different semantics of term-level functions and type families possible in Haskell. 
The operational semantics will be given in Section 6.4. 

Crucially, shared function applications $f(\delta)$ must be saturated, to distinguish 
function application from normal application. This retains the injectivity 
of type-level application from System $F_C$, and avoids introducing type-level 
$\lambda$-abstractions, which would complicate type inference. 

The signature $\Sigma$ contains function declarations $f\,[\Delta] :^{\Phi} \kappa$ that record the 
phase $\Phi$ of the function, the telescope $\Delta$ of arguments, and the resulting type $\kappa$, 
which may depend on $\Delta$. For example, the (+) function is declared at phase $\Pi$ 
(because it can be used in runtime terms and in type expressions) with telescope 
$x :^{\Pi} N,\ y :^{\Pi} N$ and result type $N$. I will write $x + y$ instead of $(+)(x, y)$. 

Function definitions $f\,[\Delta] = \rho :^{\Phi} \kappa$ are separate from declarations, because 
the body $\rho$ may call $f$ recursively. They are expanded eagerly, with a call-by-name 
semantics, and any pattern matching must be performed by explicit case 
expressions (as discussed in Subsection 6.2.4). 

Since functions are not guaranteed to terminate, they may not appear at 
phase □, which needs to be kept consistent. This means that type safety will 
not depend on strong normalisation of functions used in types, although they 
might lead to non-termination of type inference for the source language, just as 
with type families in Haskell. Of course, it is possible to impose conditions that 
guarantee termination for a class of programs, as in Agda. 

Consider the type of the function 

    vsplitAt :: ∀ a (n :: N) . Π (m :: N) → Vec (m + n) a → (Vec m a, Vec n a) 
    vsplitAt Zero    xs          = (Nil, xs) 
    vsplitAt (Suc m) (Cons x xs) = (Cons x ys, zs) 
      where (ys, zs) = vsplitAt m xs 

This type applies the function (+) to arguments at phases $\forall$ and $\Pi$ respectively, 
building a result at phase $\forall$. As with promoted constructors, this is possible due 
to the relativisation operator, applied to the function's telescope by the rule 

\[ \frac{\Sigma \ni f\,[\Delta] :^{\Phi} \kappa \qquad \Gamma \vdash \delta : \Delta /\!/ \Psi}{\Gamma \vdash f(\delta) :^{\Psi} [\delta/\Delta]\,\kappa} \] 

Phases act on telescopes, written $\Delta /\!/ \Psi$, thus: 

\[ (\Delta,\ a :^{\Phi} \kappa) /\!/ \Psi \;\mapsto\; (\Delta /\!/ \Psi),\ a :^{\Phi /\!/ \Psi} \kappa \] 

This operation will also be used in the typing rule for dependent case branches, 
so the arguments to the constructor will be available statically. 
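
The action of a phase on a telescope is a pointwise map over its bindings. As a rough illustration, the Haskell sketch below represents a telescope as a list of (name, phase, type) triples, with types kept as plain strings. The names `Binding`, `Telescope` and `relativiseTelescope` are hypothetical, as are the phase constructor names. The example relativises the telescope of Just from Subsection 6.2.2 for the static phase.

```haskell
module Main where

-- Phases, named by role (names are assumptions of this sketch).
data Phase = Static | Shared | Runtime | Proof
  deriving (Eq, Show)

-- Relativisation of a quantifier phase (first) for a judgment phase (second).
relativise :: Phase -> Phase -> Phase
relativise Proof   _       = Proof
relativise Static  _       = Static
relativise Shared  Runtime = Shared
relativise Shared  Shared  = Shared
relativise Shared  _       = Static
relativise Runtime Runtime = Runtime
relativise Runtime Shared  = Shared
relativise Runtime _       = Static

-- A binding is a variable name, the phase it is bound at, and its type
-- (kept as text, since only the phases matter here).
type Binding   = (String, Phase, String)
type Telescope = [Binding]

-- Relativise every binding in the telescope for the given judgment phase.
relativiseTelescope :: Telescope -> Phase -> Telescope
relativiseTelescope tel psi = [ (a, relativise phi psi, k) | (a, phi, k) <- tel ]

-- Telescope of the constructor Just: a static type and a runtime value.
justTelescope :: Telescope
justTelescope = [("a", Static, "*"), ("x", Runtime, "a")]

main :: IO ()
main = do
  -- In a static context both arguments must be static.
  print (relativiseTelescope justTelescope Static)
  -- In a runtime context the telescope is unchanged.
  print (relativiseTelescope justTelescope Runtime == justTelescope)
```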

6.2.4 Dependent case analysis 

The replicate and vsplitAt functions rely on dependent pattern matching: case 
analysis on the N argument establishes that the result is type-correct. That is, it 
allows 'learning by testing' (Altenkirch et al., 2005). Recall the replicate example, 
reformulated to use a dependent case expression: 
    replicate :: ∀ a :: * . Π n :: N → a → Vec a n 
    replicate n x = dcase n of 
        Zero  → Nil 
        Suc m → Cons x (replicate m x) 

In the Zero branch, Nil needs to have type Vec a n; this is possible because a local 
constraint n ~ Zero is brought into scope. Similarly, in the Suc branch, a local 
constraint n ~ Suc m is available. In general, each branch can make use of the 
information that the scrutinee is equal to the matched constructor. 

This resembles a GADT pattern match (see Subsection 5.1.3, page 92). Indeed 
the singleton construction makes use of GADTs to encode dependent pattern 
matching. The crucial difference is that here the constraint is not an implicit 
argument to the data constructor, as with GADTs, but is separately brought into 
scope by the dependent case expression.⁶ 
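
For comparison, the singleton encoding mentioned above can be written in GHC Haskell as follows. This is a minimal sketch of the encoding, not of dcase itself: the singleton type `SNat`, the names `replicateV` and `toList`, and the closed type family `+` are assumptions introduced for the example. Matching on `SZero` or `SSuc` is a GADT pattern match, so the equation on the index arrives as an implicit argument of the singleton constructor, which is exactly the point of difference from a dependent case expression.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeFamilies, TypeOperators #-}
module Main where

import Data.Kind (Type)

data Nat = Zero | Suc Nat

-- Singleton for Nat: a runtime witness of a type-level number.
data SNat (n :: Nat) where
  SZero :: SNat 'Zero
  SSuc  :: SNat m -> SNat ('Suc m)

-- Length-indexed vectors (index first, as in the type of vsplitAt).
data Vec (n :: Nat) (a :: Type) where
  Nil  :: Vec 'Zero a
  Cons :: a -> Vec m a -> Vec ('Suc m) a

-- Type-level addition, defined by recursion on the first argument.
type family (m :: Nat) + (n :: Nat) :: Nat where
  'Zero  + n = n
  'Suc m + n = 'Suc (m + n)

-- Matching on the singleton refines n in each branch.
replicateV :: SNat n -> a -> Vec n a
replicateV SZero    _ = Nil
replicateV (SSuc m) x = Cons x (replicateV m x)

-- Matching on the singleton lets (m + n) reduce in each branch.
vsplitAt :: SNat m -> Vec (m + n) a -> (Vec m a, Vec n a)
vsplitAt SZero    xs          = (Nil, xs)
vsplitAt (SSuc m) (Cons x xs) =
  case vsplitAt m xs of
    (ys, zs) -> (Cons x ys, zs)

toList :: Vec n a -> [a]
toList Nil         = []
toList (Cons x xs) = x : toList xs

main :: IO ()
main = do
  let three         = SSuc (SSuc (SSuc SZero))
      (front, back) = vsplitAt (SSuc SZero) (replicateV three 'x')
  print (toList front, toList back)
```

The singleton argument duplicates the number at the term level, whereas a shared variable bound by Π serves both roles at once.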

The dcase construct of the evidence language supports dependent case analysis. 
In the typing rule for dependent case branches, an additional variable is 
brought into scope: a proof that the scrutinee is equal to the matched constructor. 
The scrutinee must be well-typed at phase $\forall$, since it will appear in an 
equality type. This is ensured by checking it at phase $\Pi /\!/ \Psi$ where $\Psi$ is the phase 
of the case expression; the access policy gives $\Pi /\!/ \Psi \hookrightarrow \forall$. A non-dependent case 
construct is also available, allowing runtime expressions to appear as scrutinees. 

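Thus the body of replicate could be translated into the evidence term 

\[
\begin{array}{l}
\mathsf{dcase}\ n\ \mathsf{of} \\
\quad \mathsf{Zero}\ (c :^{\Box} n \sim \mathsf{Zero}) \to \mathsf{Nil}\ a\ n\ c \\
\quad \mathsf{Suc}\ (m :^{\Pi} N,\ c :^{\Box} n \sim \mathsf{Suc}\ m) \to \mathsf{Cons}\ a\ n\ m\ x\ (\mathsf{replicate}\,(a, m, x))\ c
\end{array}
\]

where the types of the constructors, after the GADT translation, are: 

\[
\begin{array}{l}
\mathsf{Nil} : (a :^{\forall} *,\ n :^{\forall} N,\ c :^{\Box} n \sim \mathsf{Zero}) \to \mathsf{Vec}\ a\ n \\
\mathsf{Cons} : (a :^{\forall} *,\ n :^{\forall} N,\ m :^{\forall} N,\ x :^{\lambda} a,\ xs :^{\lambda} \mathsf{Vec}\ a\ m,\ c :^{\Box} n \sim \mathsf{Suc}\ m) \to \mathsf{Vec}\ a\ n
\end{array}
\]

The mechanism for reconstructing the implicit arguments will be discussed in 
Chapter 7. Note that the recursive call to replicate uses an alternative syntax, 
with a comma-separated vector of arguments, to emphasise the fact that it is a 
fully-applied shared function (see Subsection 6.2.3). 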

6 Of course, a GADT may appear as the scrutinee type in a dependent case expression. 






6.3 Type system 



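The evidence type system consists of the following judgments: 

    $\Sigma \vdash \mathsf{sig}$                                          $\Sigma$ is a valid signature 
    $\Gamma \vdash \mathsf{ctx}$                                          $\Gamma$ is a valid context 
    $\Gamma \vdash \rho :^{\Phi} \kappa$                                  $\rho$ has type $\kappa$ at phase $\Phi$ in context $\Gamma$ 
    $\Gamma \vdash br :^{\Phi} \upsilon \blacktriangleright \tau$         $br$ is a case branch with scrutinee type $\upsilon$, result type $\tau$ 
    $\Gamma \vdash br :^{\Phi} (e : \upsilon) \blacktriangleright \tau$   $br$ is a dependent case branch, scrutinee $e : \upsilon$, result $\tau$ 
    $\Gamma \vdash \delta : \Delta$                                       $\delta$ is a vector in $\Delta$ 
    $\Gamma \vdash_{\mathsf{tc}} \omega : \Delta$                         $\omega$ is a telescoped coercion with domain $\Delta$ 

All judgments except $\Sigma \vdash \mathsf{sig}$ are implicitly parameterised by a signature $\Sigma$. 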

6.3.1 Well-formed signatures and contexts 

Figure 6.5 defines the signature and context well-formedness judgments. These 
check that each declared name is fresh (written #) and well-typed in the appropriate 
sense, and that it is introduced at a suitable phase. The signature $\Sigma$ contains 
global top-level definitions: type constructors $D$, data constructors $K$, functions 
$f$ and axioms $C$. The context $\Gamma$ binds variables. 

Type constructors are always static, whereas data constructors may be static, 
dynamic or shared (but not proofs). A Haskell-style datatype declaration corresponds 
to a single type constructor and a number of data constructors. System 
$F_C$ encodes datatypes in the same way, although my use of telescopes $\Delta$ to collect 
type and term bindings represents a slight simplification. For GADT data constructors, 
the return type is an application of the type constructor to variables, 
but the telescope will include constraints on the variables. 

As discussed in Subsection 6.2.3, functions $f$ are separated into declarations 
$f\,[\Delta] :^{\Phi} \kappa$ and definitions $f\,[\Delta] = \rho :^{\Phi} \kappa$, with the declaration appearing before 
the definition in the signature, in order to permit general recursion. They have 
a telescope $\Delta$ of parameters, which the result type $\kappa$ may depend on. Function 
applications will always be saturated (written $f(\delta)$ where $\delta$ is a vector in $\Delta$). 

Axioms $C :^{\Box} \varphi$ assert that all closed instances of the proposition $\varphi$ hold. For 
example, the proposition $(a :^{\forall} N,\ b :^{\forall} N) \to (a + b) \sim (b + a)$ asserts that addition 
is commutative, but this fact is not otherwise derivable as a coercion (because 
the proof language does not permit induction). Adding this as an axiom makes it 
available when generating evidence for equalities. Since the exact form of proofs 
is unimportant, much like in Observational Type Theory (Altenkirch et al., 2007), 
any consistent axiom may be added without affecting computation. 
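
A loose analogue of such an axiom in GHC Haskell is to postulate an equality with `unsafeCoerce`. The sketch below is only an analogy, not the mechanism of the evidence language: `plusComm`, `reindex` and `toList` are hypothetical names, and the postulate is sound only because commutativity of addition really does hold for every closed instance. As in the text, the postulated proof carries no computational content; it merely lets the type checker accept the re-indexed vector.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeFamilies, TypeOperators #-}
module Main where

import Data.Kind (Type)
import Data.Proxy (Proxy (..))
import Data.Type.Equality ((:~:) (Refl))
import Unsafe.Coerce (unsafeCoerce)

data Nat = Zero | Suc Nat

type family (m :: Nat) + (n :: Nat) :: Nat where
  'Zero  + n = n
  'Suc m + n = 'Suc (m + n)

data Vec (n :: Nat) (a :: Type) where
  Nil  :: Vec 'Zero a
  Cons :: a -> Vec m a -> Vec ('Suc m) a

-- Postulated rather than proved: the type checker cannot derive this
-- equation for open a and b, so it is asserted as an axiom.
plusComm :: Proxy a -> Proxy b -> (a + b) :~: (b + a)
plusComm _ _ = unsafeCoerce Refl

-- Using the axiom: matching on Refl brings (m + n) ~ (n + m) into scope.
reindex :: Proxy m -> Proxy n -> Vec (m + n) a -> Vec (n + m) a
reindex pm pn v = case plusComm pm pn of
  Refl -> v

toList :: Vec n a -> [a]
toList Nil         = []
toList (Cons x xs) = x : toList xs

main :: IO ()
main = print (toList (reindex one one (Cons 1 (Cons 2 Nil))))
  where
    one = Proxy :: Proxy ('Suc 'Zero)
```

An inconsistent postulate written this way would break type safety, which mirrors the requirement that axioms added to the signature be consistent.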
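$\Sigma \vdash \mathsf{sig}$   ($\Sigma$ is a valid signature) 

\[
\frac{}{\cdot \vdash \mathsf{sig}}
\qquad
\frac{\Sigma \vdash \mathsf{sig} \quad D \# \Sigma \quad \overline{a_i :^{\forall} \kappa_i}^{\,i} \vdash \mathsf{ctx}}
     {\Sigma,\ D :^{\forall} (\overline{a_i :^{\forall} \kappa_i}^{\,i}) \to * \ \vdash \mathsf{sig}}
\qquad
\frac{\Sigma \vdash \mathsf{sig} \quad K \# \Sigma \quad \Phi \neq \Box \quad \overline{a_i :^{\forall} \kappa_i}^{\,i}, \Delta \vdash D\,\overline{a_i}^{\,i} :^{\forall} *}
     {\Sigma,\ K :^{\Phi} (\overline{a_i :^{\forall} \kappa_i}^{\,i}, \Delta) \to D\,\overline{a_i}^{\,i} \ \vdash \mathsf{sig}}
\]

\[
\frac{\Sigma \vdash \mathsf{sig} \quad f \# \Sigma \quad \Phi \neq \Box \quad \Delta \vdash \kappa :^{\forall} *}
     {\Sigma,\ f\,[\Delta] :^{\Phi} \kappa \ \vdash \mathsf{sig}}
\qquad
\frac{\Sigma \vdash \mathsf{sig} \quad \Sigma \ni f\,[\Delta] :^{\Phi} \kappa \quad \Delta \vdash \rho :^{\Phi} \kappa}
     {\Sigma,\ f\,[\Delta] = \rho :^{\Phi} \kappa \ \vdash \mathsf{sig}}
\qquad
\frac{\Sigma \vdash \mathsf{sig} \quad C \# \Sigma \quad \cdot \vdash \varphi :^{\forall} *}
     {\Sigma,\ C :^{\Box} \varphi \ \vdash \mathsf{sig}}
\]

$\Gamma \vdash \mathsf{ctx}$   ($\Gamma$ is a valid context) 

\[
\frac{\Sigma \vdash \mathsf{sig}}{\cdot \vdash \mathsf{ctx}}
\qquad
\frac{a \# \Gamma \quad \Gamma \vdash \kappa :^{\forall} * \quad \Phi \neq \Box}
     {\Gamma,\ a :^{\Phi} \kappa \ \vdash \mathsf{ctx}}
\qquad
\frac{c \# \Gamma \quad \Gamma \vdash \varphi :^{\forall} *}
     {\Gamma,\ c :^{\Box} \varphi \ \vdash \mathsf{ctx}}
\]

Figure 6.5: Validity of signatures and contexts 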

In contexts, the validity rules require that the type of each variable is well-kinded. 
They distinguish between coercion variables $c$ and other variables $a$, 
because the type of a coercion variable must be syntactically a proposition $\varphi$ 
rather than an arbitrary type $\kappa$, for technical reasons in the consistency proof. 

6.3.2 Well-typed terms 

Figure 6.6 defines the expression typing judgment $\Gamma \vdash \rho :^{\Phi} \kappa$, meaning that $\rho$ is 
an expression of type $\kappa$ when checked at phase $\Phi$. The same judgment is given 
additional rules in Figure 6.7, as discussed in the next subsection. The variable, 
application and function rules were introduced in Section 6.2. 

Type constructors $D$ and data constructors $K$ are available as declared in the 
signature. In addition, there are two built-in constants: $*$ (the kind of types) 
and heterogeneous equality ($\sim$). I will write the equality relation infix, using the 
syntactic sugar introduced in Subsection 6.3.5. 

The rule for casts $\rho \triangleright \gamma$ explicitly changes the type of $\rho$ using the coercion $\gamma$. 
This replaces the conversion rule, which would prevent decidability of typechecking 
since type expressions are not strongly normalising. Casting a proof uses a 
separate rule, described in the next subsection. 
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K 



(p has type k at phase ^) 



T h ctx 

T 3 a : k 



r h ctx 

£ 3 D : v k 



r h ctx 

£ 9 K :* 



K 



rh«: w K 

E9/[A]:* K 
T h 5 : A // ^ 



T h D : 



.v 



K 



T h K :* 



K 



r h p :* (a:*/ti) — > «2 
r h :* [<f/A] « r h pV :* [ P 7«] ^ 



* ^ □ 



r h k : v * 
r,ffl:*fihr : v * 



ctx 



r h ctx 



:* «/ 



r h * 



r h p>7 

T,a :* re hp : n r 
rh Aa:*K.p : n (a:* k) ^ r 



r h 



;a: v *) ->■ (6: v *) ->• a ->• & ->■ 



rhp:% * ^ □ 

r h 6r 0 :* v ► r ... T h 6r n :* v ► r 



T h casepof 6r 0 ... &r n :* r 



The r 11 //* 



* ^ □ 



T h 6r 0 :* (e : u) ► r ... T h 6r n :* (e : v) ► r 
r h dcaseeof br 0 ... br n :* r 



r h br :* v ► r 



(for is a well-typed case branch) 



£ 9 K : ( fli : v /t f , A) Da7 



r h r : v * 



rhK([^/a,-']A)^p :*D^ l ^r 



r h for :* (e : u) ► r 



(for is a well-typed dependent case branch) 



£ 9 K : (a,- : v Ki , A) -»• DaT 

A' = [^M i ]A//n,c: D £~(K^ i A) 



r,A'hp :* r Thr 



$ ^ n / * 



ThKA'^p :*(e: D^ 1 ) ►r 



Figure 6.6: Typing rules 
Abstractions $\Lambda a :^{\Phi} \kappa.\ \rho$ are available at phases $\Omega \in \{\lambda, \Box\}$, but may not 
appear directly in types (at phase $\forall$ or $\Pi$). 

Case expressions were discussed in Subsection 6.2.4. They may not occur in 
proofs, since nontermination might result. Case branches are checked using two 
auxiliary judgments, $\Gamma \vdash br :^{\Phi} \upsilon \blacktriangleright \tau$ and $\Gamma \vdash br :^{\Phi} (e : \upsilon) \blacktriangleright \tau$, meaning that 
the branch $br$ matches a scrutinee of type $\upsilon$ and returns an expression of type $\tau$ 
at phase $\Phi$. The second judgment makes an extra assumption, that the scrutinee 
is equal to $e$, available in the branch. 

6.3.3 Well-typed coercions 

Figure 6.7 adds rules for well-typed coercions to the typing judgment of Figure 6.6. 
Thus variables, applications and abstractions are available for coercions as well 
as other expressions. Coercions have a specialised version of the cast rule $\gamma \triangleright \eta$, 
which ensures that the result of the cast is syntactically a proposition $\varphi'$. 

The coercion syntax includes the general-purpose congruence rule $\mathsf{resp}\ \omega\ \Delta\ \tau$,⁷ 
making various structural rules derivable by asserting that $[\overleftarrow{\omega}/\Delta]\,\tau \sim [\overrightarrow{\omega}/\Delta]\,\tau$. 
In particular, it means that reflexivity, symmetry, transitivity and congruence for 
dynamic functions are all derivable rules, as shown in Figure 6.9. 

Making congruence an explicit coercion form avoids the need to prove its 
admissibility (called the 'lifting theorem' in previous work on System $F_C$) and 
reduces the number of structural rules required. The system is proof-irrelevant 
so the exact form of the coercion language is unimportant. The formulation 
given here is not general enough to prove the congruence rules for application 
($\mathsf{cong@}^{\forall}\,\gamma\,\eta$ and $\mathsf{cong@}^{\Box}\,\gamma\,(\eta_1, \eta_2)$), quantification ($\mathsf{cong}^{\Phi}\,\eta\,\gamma$) and case analysis 
($\mathsf{cong}\ \mathsf{(d)case}\ \gamma\ \overline{\eta_i}^{\,i}$), so these must be present explicitly.⁸ 

The congruence rule for case analysis relies on the auxiliary definitions in 
Figure 6.8 for computing the equality proposition between two case branches. The 
operation A /A A' combines two telescopes that bind corresponding variables, but 
may assign them types that are only propositionally equal. It produces a single 
telescope that quantifies over variables of both types and a proof of their equality. 
Equality between two case branches br pa br' takes the proposition that the branch 
results are equal and quantifies over the combined telescope. 

Just like in System $F_C$, injectivity rules $\mathsf{left}\ \gamma$ and $\mathsf{right}\ \gamma$ allow decomposition 

7 It is sometimes useful to optimise coercions (such as replacing a coercion whose subterms 
are all reflexive with a direct appeal to reflexivity). This is possible with the resp formulation, 
but may be easier if all the structural rules are introduced separately. 

8 A more general congruence rule, allowing local parameterisation in the telescope, could be 
used to remove these. 
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r h 7 : D if (7 has type if at phase D) 



rh tc w: A r,Ahr: V K r h 7 : D t t' ^ vv' 

rhrespwAr : D [to /A] r ~ [uf/A] r Y h left 7 : D r ~ v 

rh 7 : D rT'~W Th 7 : D (( 0l :*«x) n) ~ ((a, :*« 2 ) r 2 ; 



T h right 7 : D r' ~ v' Y h left 7 : D «i ~ k 2 

T h 7 : D («!-»• n) ~ (k 2 ->■ r 2 ) T h ctx E 9 C : D f 

r h right 7 : D n ~ r 2 rhC:> 

T h 7 : D (n : (ai : T «i) ->■ «i) ~ (r 2 : (a 2 : T k 2 ) -»■ «' 2 ) 
r h 77 : n 1 : «i) ~ (v 2 : « 2 ) 

T h conga T 7?] : D (tiI>i) ~ (r 2 t; 2 ) 

T h 7 : D (n : (ci : D y?i) ->■ «i) ~ (r 2 : (c 2 : D if 2 ) ->• k 2 ) 

r h 771 :° gi r h 7? 2 : D y? 2 

r h conga D 7 (771, 772) : D (n 771) ~ (r 2 772) 

r, a\ : «i h Ti : * T, a 2 : /t 2 h r 2 : * r h 77 : «i ~ k 2 

T h 7 : D : T «i, a 2 : T /c 2 , c : D a x ~ a 2 ) ->■ Ti ~ r 2 

T h cong T 77 7 : D ((01 : T «i) ->■ n) ~ ((a 2 : T « 2 ) ->• r 2 ) 

T, ci : D </?i h ri : V * T, c 2 : D v? 2 h r 2 : v * 

r h 77 : a (f 1 ~ (/9 2 T h 7 : D (ci : D ifi, c 2 : D f 2 ) ->■ n ~ r 2 

T h cong □ 777 : D ((ci: D y?i) ->■ n) ~ ((c 2 : n v? 2 ) ->■ r 2 ) 

r h 7 : D e ~ e' r h 770 : D 6r 0 w 6tq . . . Y h r/ n : D br n « 6r' n 

T h (cong (d)case7 77i i ) : D ((d)caseeof ftr/) ~ ((d)casee'of 6r'/) 

rh 7 : D ^ rh 7 : D (( ai : T Kl )^r 1 )~((a 2 : T / t2 )^r 2 ) 

r h 7/ : D (/9 ~ if' Y h 77 : D (vi : «i) ~ (i> 2 : « 2 ) 

r h 7 > 77 : a y?' rh 7@7/ : D [1*1/ ai] Ti ~ [v 2 / a 2 ] t 2 

Y h 7 : D (( Cl : D </?!) -)• n) ~ ((ca : D f 2 ) -)• r 2 ) T h 7 : D (n : «i) ~ (r 2 : « 2 ) 
r h 7/! : D y?! T h 7/ 2 : n y? 2 rhT/ : D Kl ~v 

r h 7@ (771,772) : D [rji/ci] Ti ~ [772/C2] r 2 rhcoh7 77 : D n > 77 ~ r 2 

r h r : v K rhr': v K r gg> r ; r h 7 : D (n : Kl ) ~ (r 2 : k 2 ) 

T h stepr : D r ~ r' Y h kind 7 : D «i ~ k 2 



Figure 6.7: Well-typed coercions 
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(K A -> r) « (K A' -> r') ^ ((A /X\ A')) (r ~ r') 



• /X\ ■ ^ 

r, a : T re /A r, a' : T re' ^ 

r.iiM r , x' : n r' ^ 



r Mr, a : T re, a' : T re', c : D a ~ a' 
r A\ r , x : n r, x' : n r' 



Figure 6.8: Evidence for equality of case branches 



r h 7 : D ip (derivable rules: 7 has type ip at phase n) 

r h r : v re Th 7 : D (^reQ - (r 2 :re 2 ) 

rh(r) : D r~r rhsym 7 : D (r 2 :re 2 ) ~ (nrrei) 

r h 71 : D (ti:«i) ~ (r 2 :re 2 ) 

T h 7 2 : D (r 2 : re 2 ) ~ (r 3 : re 3 ) r h 77 : D n ~ r 2 T h 7 : D i>i ~ t> 2 



T I- 7i;72 : D (ri:«i) ~ (r 3 :re 3 ) T h cong Xvi -° (Tl ->■ ~ (r 2 ->• u 2 ) 
T h 7 : D (ri : rei ->■ fi) ~ (r 2 : re 2 ->■ t> 2 ) 

rh:° (r{:«i)~(^:K2) T h 7 : d Hrf -Ht^' 



T h conga A 7?7 : D (n rj) ~ (r 2 r 2 ) T h nth* 7 : n r,- ~ t>,- 



(r) H> resp ■ • r 
syni7 i-> (n) > resp ((rei, re 2 , kind 7), (ti,t 2 ,7)) (a : v *, & : v a) (6 ~ n) 
7i5 72 ^ 7i> resp ((re 2 ,re 3 , kind 7 2 ),(r 2 ,r 3 ,7 2 )) (a : v *, b : v a) (n ~ b) 
congJ^Vl resp ((n, r 2 , 77), (t^, t> 2 , 7)) (a : v *, & : v *) (a ->• 6) 

conga^7 7] 1— )• resp ((rei, re 2 , left (kind 7)), (vi, t> 2 , right (kind 7)), 

(ri,r 2 ,7), {t[,t^t])) 
(a : v *, b : v *, x : v (a — >■ b), y : v a) (x y) 
nth* 7 1 — y right ( left . . . left 7) where 77 * and t)j ! have n elements 



n—i times 



Figure 6.9: Derivable rules for coercions 
of an equation between applications or non-dependent function spaces. Instantiation 
rules $\gamma @ \eta$ and $\gamma @ (\eta_1, \eta_2)$ play a similar role for dependent quantifications. 

As in the work of Weirich et al. (2013), heterogeneous equality uses the '£- 
interpretation' in which an equation between expressions of different kinds implies 
that the kinds themselves are equal. This is witnessed by the $\mathsf{kind}\ \gamma$ coercion. 
Also present in their work and Observational Type Theory is the coherence rule 
$\mathsf{coh}\ \gamma\ \eta$, which states that casts do not change the identity of an expression. 

New in the evidence language is the $\mathsf{step}\ \tau$ rule, making a redex equal to its 
reduct. Thus the operational semantics, given by the $\longrightarrow$ relation to be defined 
in Section 6.4, is embedded in the propositional equality. The presence of step 
constructors means that the computation necessary to typecheck a term is finite. 

6.3.4 Vectors and telescoped coercions 

Figure 6.10 gives the rules for vectors and telescoped coercions. A vector $\delta$ contains 
expressions that can be substituted for a telescope $\Delta$. Thus each expression 
in the vector must be checked at the appropriate phase, with the type determined 
by substituting for the preceding telescope. 

Equality of types ($\sim$) extends to equality of vectors. A telescoped coercion 
$\omega$ represents two vectors ($\overleftarrow{\omega}$ and $\overrightarrow{\omega}$) along with proofs of equality for the type 
expressions they contain. Thus it consists of pairs of type expressions plus a coercion 
between them $(\tau, \upsilon, \gamma)$, and pairs of terms $(e, e')$ or coercions $(\gamma, \gamma')$. 
No proof of equality is needed for runtime terms because they cannot appear in 
types; no proof is needed for coercions because the system is proof-irrelevant. 

6.3.5 Syntactic sugar 

Some convenient abbreviations are given in Figure 6.11. Just as System $F_C$ formally 
distinguishes between type, term and coercion application, so applications 
$\rho\,{}^{\Phi}\rho'$ carry a phase $\Phi$, but this is easily recoverable from the type of $\rho$, so I will 
usually omit it. The presence of phase annotations on applications allows the 
erasure operation (Section 6.6) to be defined on the syntax of terms, rather than 
on typing derivations, but it is otherwise harmless to omit the annotations. I will 
write the application of an expression to a vector $\rho\ \delta$. 

Since dynamic variables cannot occur in type expressions, thanks to the phase 
distinction, the function space $(x :^{\lambda} \tau) \to \upsilon$ may be written $\tau \to \upsilon$, as there is 
no possibility of $x$ occurring in $\upsilon$. The familiar notation $\lambda x{:}\tau.\ e$ is used for 
inhabitants of this function space. 
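$\Gamma \vdash \delta : \Delta$   ($\delta$ is a vector in $\Delta$) 

\[
\frac{\Gamma \vdash \mathsf{ctx}}{\Gamma \vdash \cdot : \cdot}
\qquad
\frac{\Gamma \vdash \delta : \Delta \qquad \Gamma \vdash \rho :^{\Phi} [\delta/\Delta]\,\kappa}
     {\Gamma \vdash (\delta, \rho) : (\Delta,\ a :^{\Phi} \kappa)}
\]

$\Gamma \vdash_{\mathsf{tc}} \omega : \Delta$   ($\omega$ is a telescoped coercion in $\Delta$) 

\[
\frac{\Gamma \vdash \mathsf{ctx}}{\Gamma \vdash_{\mathsf{tc}} \cdot : \cdot}
\qquad
\frac{\Gamma \vdash_{\mathsf{tc}} \omega : \Delta \quad \Gamma \vdash \gamma :^{\Box} \tau \sim \upsilon \quad
      \Gamma \vdash \tau :^{\forall} [\overleftarrow{\omega}/\Delta]\,\kappa \quad \Gamma \vdash \upsilon :^{\forall} [\overrightarrow{\omega}/\Delta]\,\kappa}
     {\Gamma \vdash_{\mathsf{tc}} (\omega, (\tau, \upsilon, \gamma)) : (\Delta,\ a :^{\forall} \kappa)}
\]

\[
\frac{\Gamma \vdash_{\mathsf{tc}} \omega : \Delta \quad \Gamma \vdash \eta :^{\Box} [\overleftarrow{\omega}/\Delta]\,\varphi \quad \Gamma \vdash \eta' :^{\Box} [\overrightarrow{\omega}/\Delta]\,\varphi}
     {\Gamma \vdash_{\mathsf{tc}} (\omega, (\eta, \eta')) : (\Delta,\ c :^{\Box} \varphi)}
\qquad
\frac{\Gamma \vdash_{\mathsf{tc}} \omega : \Delta \quad \Gamma \vdash e :^{\lambda} [\overleftarrow{\omega}/\Delta]\,\tau \quad \Gamma \vdash e' :^{\lambda} [\overrightarrow{\omega}/\Delta]\,\tau}
     {\Gamma \vdash_{\mathsf{tc}} (\omega, (e, e')) : (\Delta,\ x :^{\lambda} \tau)}
\]

\[
\overleftarrow{\omega, (\tau, \upsilon, \gamma)} \mapsto \overleftarrow{\omega}, \tau
\qquad
\overrightarrow{\omega, (\tau, \upsilon, \gamma)} \mapsto \overrightarrow{\omega}, \upsilon
\qquad
\overleftarrow{\omega, (\rho, \rho')} \mapsto \overleftarrow{\omega}, \rho
\qquad
\overrightarrow{\omega, (\rho, \rho')} \mapsto \overrightarrow{\omega}, \rho'
\]

Figure 6.10: Vectors and telescoped coercions 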



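\[
\begin{array}{rcl}
\rho\ \delta & \mapsto & \begin{cases} \rho & \text{if } \delta = \cdot \\ (\rho\ \delta')\ \rho' & \text{if } \delta = \delta', \rho' \end{cases} \\
\rho\ \rho' & \mapsto & \rho\,{}^{\Phi}\rho' \quad \text{where } \Gamma \vdash \rho :^{\Psi} (a :^{\Phi} \tau) \to \upsilon \\
\tau \to \upsilon & \mapsto & (x :^{\lambda} \tau) \to \upsilon \\
\lambda x{:}\tau.\ e & \mapsto & \Lambda x :^{\lambda} \tau.\ e \\
(\tau_1 : \kappa_1) \sim (\tau_2 : \kappa_2) & \mapsto & (\sim)\ \kappa_1\ \kappa_2\ \tau_1\ \tau_2 \\
\tau_1 \sim \tau_2 & \mapsto & (\sim)\ \kappa_1\ \kappa_2\ \tau_1\ \tau_2 \quad \text{where } \Gamma \vdash \tau_1 :^{\forall} \kappa_1 \text{ and } \Gamma \vdash \tau_2 :^{\forall} \kappa_2
\end{array}
\]

Figure 6.11: Syntactic sugar 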
6.3.6 Meta-theoretic properties 

I will now prove some results for working with telescopes, the usual weakening and 
substitution lemmas, and a more liberal form of the substitution lemma required 
for subject reduction. Where proofs have been omitted, they are by induction on 
derivations. Writing a vector of arguments instead of a single argument for an 
application is justified by the first lemma, which I will often use implicitly. 

Lemma 6.1. Suppose $\Gamma \vdash \rho :^{\Phi} (\Delta) \to \tau$. Then $\Gamma \vdash \rho\ \delta :^{\Phi} [\delta/\Delta]\,\tau$ if and only if 
$\Gamma \vdash \delta : \Delta /\!/ \Phi$. 

Lemma 6.2. If $\Gamma \vdash_{\mathsf{tc}} \omega : \Delta$ then $\Gamma \vdash \overleftarrow{\omega} : \Delta$ and $\Gamma \vdash \overrightarrow{\omega} : \Delta$. 

Lemma 6.3 (Weakening). Let $J$ be an arbitrary judgment. If $\Gamma, \Gamma' \vdash J$ and 
$\Gamma, \Delta \vdash \mathsf{ctx}$ where the variables in $\Delta$ and $\Gamma'$ are distinct, then $\Gamma, \Delta, \Gamma' \vdash J$. 

To prove the substitution lemma, I must show that judgments are preserved 
under phase increases following the access policy, as described in Subsection 6.2.1. 

Lemma 6.4 (Phase inclusion). Suppose $\Phi \hookrightarrow \Psi$. 

(a) If $\Gamma \vdash \rho :^{\Phi} \kappa$ then $\Gamma \vdash \rho :^{\Psi} \kappa$. 

(b) If $\Gamma \vdash \delta : \Delta /\!/ \Phi$ then $\Gamma \vdash \delta : \Delta /\!/ \Psi$. 

(c) If $\Gamma \vdash_{\mathsf{tc}} \omega : \Delta /\!/ \Phi$ then $\Gamma \vdash_{\mathsf{tc}} \omega : \Delta /\!/ \Psi$. 

Proof. By induction on derivations, following from the use of the access policy 
$\Phi \hookrightarrow \Psi$ for the variable rule, the right-monotonicity of $/\!/$ for application, and the 
transitivity of $\hookrightarrow$ for case analysis. □ 
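
The order-theoretic facts this proof relies on are finite and can be checked by brute force. The Haskell sketch below is a sanity check under stated assumptions, not part of the proof: the phase constructor names and the functions `accessible` and `relativise` are hypothetical encodings of the Hasse diagram of Subsection 6.2.1 and the relativisation table of Subsection 6.2.2. It tests that the access policy is a partial order and that relativisation is monotone in its right argument.

```haskell
module Main where

-- Phases, named by role (names are assumptions of this sketch).
data Phase = Static | Shared | Runtime | Proof
  deriving (Eq, Show, Enum, Bounded)

phases :: [Phase]
phases = [minBound .. maxBound]

-- Access policy: the shared phase lies below the static and runtime phases.
accessible :: Phase -> Phase -> Bool
accessible p q | p == q   = True
accessible Shared Static  = True
accessible Shared Runtime = True
accessible _      _       = False

-- Relativisation of a quantifier phase (first) for a judgment phase (second).
relativise :: Phase -> Phase -> Phase
relativise Proof   _       = Proof
relativise Static  _       = Static
relativise Shared  Runtime = Shared
relativise Shared  Shared  = Shared
relativise Shared  _       = Static
relativise Runtime Runtime = Runtime
relativise Runtime Shared  = Shared
relativise Runtime _       = Static

-- Partial order laws for the access policy.
reflexive, antisymmetric, transitive :: Bool
reflexive     = and [ accessible p p | p <- phases ]
antisymmetric = and [ p == q | p <- phases, q <- phases
                             , accessible p q, accessible q p ]
transitive    = and [ accessible p r | p <- phases, q <- phases, r <- phases
                                     , accessible p q, accessible q r ]

-- Right-monotonicity: raising the judgment phase raises the argument phase.
rightMonotone :: Bool
rightMonotone =
  and [ accessible (relativise phi psi) (relativise phi psi')
      | phi <- phases, psi <- phases, psi' <- phases, accessible psi psi' ]

main :: IO ()
main = print (reflexive, antisymmetric, transitive, rightMonotone)
```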

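Lemma 6.5 (Substitution). Suppose $\Gamma \vdash \delta : \Delta$ and let $\Gamma'$ be a telescope. 

(a) If $\Gamma, \Delta, \Gamma' \vdash \mathsf{ctx}$ then $\Gamma, [\delta/\Delta]\,\Gamma' \vdash \mathsf{ctx}$. 

(b) If $\Gamma, \Delta, \Gamma' \vdash \rho :^{\Phi} \kappa$ then $\Gamma, [\delta/\Delta]\,\Gamma' \vdash [\delta/\Delta]\,\rho :^{\Phi} [\delta/\Delta]\,\kappa$. 

(c) If $\Gamma, \Delta, \Gamma' \vdash \delta' : \Delta'$ then $\Gamma, [\delta/\Delta]\,\Gamma' \vdash [\delta/\Delta]\,\delta' : [\delta/\Delta]\,\Delta'$. 

(d) If $\Gamma, \Delta, \Gamma' \vdash_{\mathsf{tc}} \omega : \Delta'$ then $\Gamma, [\delta/\Delta]\,\Gamma' \vdash_{\mathsf{tc}} [\delta/\Delta]\,\omega : [\delta/\Delta]\,\Delta'$. 

Proof. By induction on derivations. The interesting case is for variables in $\Delta$. 
Here $\delta$ contains an expression that is well-typed at the phase of the variable, and 
Lemma 6.4 means it is well-typed at the phase at which the variable is used. □ 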
$\Phi \propto \Psi$   (checking types at phase $\Phi$ may involve checking types at phase $\Psi$) 



$ oc * 

$ oc $ $ oc V $ oc 

Figure 6.12: Relevance relation 

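To prove subject reduction in the presence of promotion, I will need a more 
liberal substitution lemma (Lemma 6.8), where the vector being substituted inhabits 
$\Delta /\!/ \Phi$ rather than $\Delta$. This depends on the fact that if a typing judgment 
holds at phase $\Phi$, then it still holds when the $/\!/ \Phi$ operator is applied to part of 
the context. However, a straightforward inductive proof of this property fails, 
due to the phase change in the application rule. Instead, I must prove a more 
general property relating the phases involved (Lemma 6.7), using the 'relevance' 
relation $\Phi \propto \Psi$ defined in Figure 6.12. 

Lemma 6.6. If $\Phi \propto \Psi$ and $\Phi' \hookrightarrow \Psi$ then $\Phi' /\!/ \Phi \hookrightarrow \Psi$. 

Lemma 6.7 (Context for phase). Suppose $\Phi \propto \Psi$. 

(a) If $\Gamma, \Delta, \Gamma' \vdash \mathsf{ctx}$ then $\Gamma, \Delta /\!/ \Phi, \Gamma' \vdash \mathsf{ctx}$. 

(b) If $\Gamma, \Delta, \Gamma' \vdash \rho :^{\Psi} \kappa$ then $\Gamma, \Delta /\!/ \Phi, \Gamma' \vdash \rho :^{\Psi} \kappa$. 

(c) If $\Gamma, \Delta, \Gamma' \vdash \delta : \Delta' /\!/ \Psi$ then $\Gamma, \Delta /\!/ \Phi, \Gamma' \vdash \delta : \Delta' /\!/ \Psi$. 

(d) If $\Gamma, \Delta, \Gamma' \vdash_{\mathsf{tc}} \omega : \Delta' /\!/ \Psi$ then $\Gamma, \Delta /\!/ \Phi, \Gamma' \vdash_{\mathsf{tc}} \omega : \Delta' /\!/ \Psi$. 

Proof. By induction on derivations. In the variable case, if $x :^{\Phi'} \kappa \in \Delta$ and 
$\Phi' \hookrightarrow \Psi$ then $\Phi' /\!/ \Phi \hookrightarrow \Psi$ by Lemma 6.6. Thus the variable rule still applies. 
For application at phase $\Psi$, the argument is well-typed at phase $\Phi' /\!/ \Psi$ and 
$\Phi \propto \Phi' /\!/ \Psi$ by definition, so the result follows by induction. □ 

Lemma 6.8 (Substitution at phase). Suppose $\Gamma \vdash \delta : \Delta /\!/ \Phi$ and fix $\Gamma'$. 

(a) If $\Gamma, \Delta, \Gamma' \vdash \mathsf{ctx}$ then $\Gamma, [\delta/\Delta]\,\Gamma' \vdash \mathsf{ctx}$. 

(b) If $\Gamma, \Delta, \Gamma' \vdash \rho :^{\Phi} \kappa$ then $\Gamma, [\delta/\Delta]\,\Gamma' \vdash [\delta/\Delta]\,\rho :^{\Phi} [\delta/\Delta]\,\kappa$. 

(c) If $\Gamma, \Delta, \Gamma' \vdash \delta' : \Delta' /\!/ \Phi$ then $\Gamma, [\delta/\Delta]\,\Gamma' \vdash [\delta/\Delta]\,\delta' : [\delta/\Delta]\,\Delta' /\!/ \Phi$. 

(d) If $\Gamma, \Delta, \Gamma' \vdash_{\mathsf{tc}} \omega : \Delta' /\!/ \Phi$ then $\Gamma, [\delta/\Delta]\,\Gamma' \vdash_{\mathsf{tc}} [\delta/\Delta]\,\omega : [\delta/\Delta]\,\Delta' /\!/ \Phi$. 

Proof. In each case, Lemma 6.7 gives that $\Gamma, \Delta, \Gamma' \vdash J$ implies $\Gamma, \Delta /\!/ \Phi, \Gamma' \vdash J$ 
(by reflexivity of $\propto$). Then the result follows from Lemma 6.5. □ 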
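Each judgment has associated sanity conditions, giving admissible rules: 

Lemma 6.9 (Sanity conditions). Let $\Sigma$ be the implicit signature. 

    $\Gamma \vdash \mathsf{ctx}$ implies $\Sigma \vdash \mathsf{sig}$ 
    $\Gamma \vdash \rho :^{\Phi} \tau$ implies $\Gamma \vdash \tau :^{\forall} *$ and $\Gamma \vdash \mathsf{ctx}$ 
    $\Gamma \vdash \delta : \Delta$ implies $\Gamma \vdash \mathsf{ctx}$ 
    $\Gamma \vdash_{\mathsf{tc}} \omega : \Delta$ implies $\Gamma \vdash \mathsf{ctx}$ 

Proof. By induction on derivations, using the preceding results. Consider the 
application rule as an example: 

\[ \frac{\Gamma \vdash \rho :^{\Psi} (a :^{\Phi} \kappa_1) \to \kappa_2 \qquad \Gamma \vdash \rho' :^{\Phi /\!/ \Psi} \kappa_1}{\Gamma \vdash \rho\,{}^{\Phi}\rho' :^{\Psi} [\rho'/a]\,\kappa_2} \] 

Induction on the first premise gives $\Gamma \vdash (a :^{\Phi} \kappa_1) \to \kappa_2 :^{\forall} *$, so $\Gamma, a :^{\Phi} \kappa_1 \vdash \kappa_2 :^{\forall} *$ 
by inversion. Then Lemma 6.8 gives $\Gamma \vdash [\rho'/a]\,\kappa_2 :^{\forall} *$. □ 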

6.4 Operational semantics 

In this section, I will give a small-step operational semantics for expressions. 
The reduction rules are given in Figure 6.13. These are essentially the rules 
of System $F_C$ (Sulzmann et al., 2007), with the addition of function definitions 
and dependent case analysis. The other novelty is that the rules apply to type 
expressions as well as terms. 

The syntax of values $v$ and value types $\xi$ is: 

\[
\begin{array}{rcl}
v & ::= & H\,\delta \mid (a :^{\Phi} \kappa) \to \tau \mid \Lambda a :^{\Phi} \kappa.\ e \\
\xi & ::= & H\,\delta \mid (a :^{\Phi} \kappa) \to \tau
\end{array}
\]

A value type is a value that has kind $*$ at phase $\forall$ (so $\Lambda$-abstraction is excluded). 
In the usual System $F_C$ fashion, expressions reduce to values that may be wrapped 
in a coercion, so there are rules to push coercions out of the way when they would 
otherwise prevent reduction. Of particular note is the push rule for the scrutinee 
of a case expression, described in Subsection 6.4.1. 

The same rules apply to phases $\forall$, $\Pi$ and $\lambda$, but coercions (at phase $\Box$) 
are not evaluated. For type expressions, evidence that the redex is equal to 
the reduct may be required. The usual practice in dependent type theory is to 
build reduction into the definitional equality, but here there is no guarantee that 
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(p reduces to p' in one step) 

kpush 



P 



P 



p>r] — > p >r] 

kpush / 

: — — > e 



PP 



P'p" 



case p of br,j — > case p' of br-j " 
br' 0 = br 0 > step £ ... 6r' n = 6r„ > step e KA^pg&r,- 



dcaseeof br 0 ... br n — > dcasee' of br' 0 ... br' n case K ip 5 of br { — >[S/A]p 
KA^pebu 1 E 3 f [A] = p :* k 



dcase K^5of 6r; 8 — ► [(5, (K ^ <5))/A] p 



f{6) — ► [5/ A] p 



(Aa:*K. e)*p — >■ [p/a] e 



T h 7 : D ((oi : T Ki) n) ~ ((a 2 : T « 2 ) r 2 ) 
7o = sym (left 7) 71 = 7@(coh (r) 70) 

(v D> 7) T r — >■ v T (r > 70) > 71 



r h 7 -P ((d : D p x ) n) ~ ((ca : D p 2 ) r 2 ) 
70 = sym (left 7) 7l = 7 @(p > 7o , p) 

(v>7) D p — > v D (p>7o) >7i 



Th 7 : D ((ai: A «i)->n) ~ ((a 2 : A /t 2 ) ->• r 2 ) 
7o = sym (left 7) 7x = right 7 

(v > 7) A p — > v A (p > 70) > 71 



(v > 7) > 7' — >■ v > (7; 7') 



kpush , 
p > P ' 



Th 7 : u Dr. 

E 9 K :* (a* : v re/, A) -»• Da 



(p reduces to p' as the scrutinee of a case expression) 



u = (t,-, v i, nth 1 7) : a» : v re; -< 5 : A 



P — > P 

kpush / 



Figure 6.13: Operational semantics for shared terms 
reduction will terminate, so explicit coercions are required to retain decidability 
of typechecking. The step coercion provides the necessary evidence: 

\[ \frac{\Gamma \vdash \tau :^{\forall} \kappa \qquad \Gamma \vdash \tau' :^{\forall} \kappa \qquad \tau \longrightarrow \tau'}{\Gamma \vdash \mathsf{step}\ \tau :^{\Box} \tau \sim \tau'} \] 

The second premise is only to ensure that the relevant sanity property, that $\tau \sim \tau'$ 
is well-kinded, does not depend on subject reduction. 

Computing an expression can change the type of a surrounding construction 
to something provably equal but not syntactically identical.⁹ For example, suppose 
$f : (a :^{\Pi} \tau) \to \upsilon$ and $\rho \longrightarrow e$, then $f(\rho) : [\rho/a]\,\upsilon$ but $f(e) : [e/a]\,\upsilon$. It is not 
straightforward to construct the coercion manipulations required to preserve the 
type, especially where there is a telescope of arguments, though the resp congruence 
can be used to prove the required equations. I take a simpler approach. 
By giving a call-by-name semantics to shared functions and lifting case analysis 
to the type level, I avoid the need for reduction in an argument position. 

In the rule for dependent case analysis, when the scrutinee takes a step, the 
branches must be coerced so that they remain type correct, since their types 
depend on a proof that the scrutinee is equal to the relevant constructor. Define 
coercion of a branch $br \triangleright \gamma$ by 

\[ (K\,(\Delta,\ c :^{\Box} e \sim K\,\delta) \to \rho) \triangleright \gamma \;\mapsto\; K\,(\Delta,\ c' :^{\Box} e' \sim K\,\delta) \to [\gamma ; c' / c]\,\rho \] 

so that $\Gamma \vdash br :^{\Phi} (e : \upsilon) \blacktriangleright \tau$ and $\Gamma \vdash \gamma :^{\Box} e \sim e'$ implies $\Gamma \vdash br \triangleright \gamma :^{\Phi} (e' : \upsilon) \blacktriangleright \tau$. 

6.4.1 The push rule for scrutinees 

Each push rule has a similar form: given an expression with a coerced value that 
blocks reduction, push the coercion deeper into the term. Coerced data construc- 
tors may block reduction if they appear as the scrutinee of a case expression, so a 
rule is needed to push the coercion inside the arguments of the data constructor. 
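For example, suppose $\Gamma \vdash \gamma :^{\Box} \mathsf{Maybe}\ \mathsf{Bool} \sim \mathsf{Maybe}\ a$ and consider the scrutinee 
$(\mathsf{Just}\ \mathsf{Bool}\ \mathsf{True}) \triangleright \gamma$. Pushing the coercion inside the arguments produces 
$\mathsf{Just}\ a\ (\mathsf{True} \triangleright \mathsf{right}\ \gamma)$, an applied constructor, so the case expression can reduce. 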
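
The same manoeuvre can be imitated in GHC Haskell with propositional equality from `Data.Type.Equality`. This is an analogy only: `castOutside` and `castInside` are hypothetical names, `castWith` stands in for the cast, and the library function `inner`, which decomposes an equation between type applications, plays the role of the right injectivity rule. Both functions produce the same value, but only the second exposes the constructor for pattern matching without first evaluating a cast.

```haskell
module Main where

import Data.Type.Equality ((:~:) (Refl), castWith, inner)

-- The coercion sits outside the constructor, as in the blocked scrutinee.
castOutside :: (Maybe Bool :~: Maybe a) -> Maybe a
castOutside co = castWith co (Just True)

-- The coercion has been pushed inside: the constructor is exposed, and its
-- argument is cast using the decomposed equation Bool ~ a.
castInside :: (Maybe Bool :~: Maybe a) -> Maybe a
castInside co = Just (castWith (inner co) True)

main :: IO ()
main = print (castOutside Refl, castInside Refl)
```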

The push rule for scrutinees is formulated as an extra reduction step, available 
when evaluating the scrutinee of a case expression, as shown in Figure 6.13. It 
is not available elsewhere as this would lead to nondeterminism: in particular, 
terms like $(K \triangleright \gamma)\ \rho$ could reduce in two different ways. 

9 In Type Theory, definitional equality includes computation, so this problem does not arise. 
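Given a coerced data constructor $K\,\overline{\tau_i}^{\,i}\,\delta \triangleright \gamma$, where $\Gamma \vdash \gamma :^{\Box} D\,\overline{\tau_i}^{\,i} \sim D\,\overline{\upsilon_i}^{\,i}$ 
and $\Sigma \ni K :^{\Phi} (\overline{a_i :^{\forall} \kappa_i}^{\,i}, \Delta) \to D\,\overline{a_i}^{\,i}$, each $\tau_i$ needs to be replaced with $\upsilon_i$ and 
the elements of the vector $\delta$ coerced appropriately. The telescoped coercion 
$\overline{(\tau_i, \upsilon_i, \mathsf{nth}^i\ \gamma)}^{\,i}$ is formed for $\overline{a_i :^{\forall} \kappa_i}^{\,i}$, then extended by $\delta$ in $\Delta$ to produce 
a telescoped coercion $\overline{(\tau_i, \upsilon_i, \mathsf{nth}^i\ \gamma)}^{\,i}, \omega$ in $\overline{a_i :^{\forall} \kappa_i}^{\,i}, \Delta$ such that $\overline{\upsilon_i}^{\,i}, \overrightarrow{\omega}$ is the new 
vector of arguments for $K$. 

Recall that a telescoped coercion $\omega$ represents two vectors in some telescope $\Gamma$, 
given by $\overleftarrow{\omega}$ and $\overrightarrow{\omega}$, plus proofs that they are equal. If $\Delta$ is a telescope extending 
$\Gamma$ and $\delta$ is a vector in $[\overleftarrow{\omega}/\Gamma]\,\Delta$, then $\omega' = \omega : \Gamma \prec \delta : \Delta$ is such that $\omega, \omega'$ 
is a telescoped coercion in $\Gamma, \Delta$, and $\overleftarrow{\omega'} = \delta$. The telescoped coercion extension 
operation is defined thus: 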

00 : T -< ■ : • i->- • 
00 : T -< (<5, r) : (A, a : T /t) H> a;', (r , r > 7, sym (coh (r) 7)) 

where a/ = a; : T -< 5 : A 
and 7 = resp (a;, a;') (T, A) k 
u:T~< (5, p) : (A, x : n t) ^ a;', (p, p > resp (a;, a;') (r, A) r) 

where a/ = a; : T -< 5 : A 

The point of this definition, upon which subject reduction will depend, is: 

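Lemma 6.10 (Telescoped coercion extension). Suppose that $\Gamma \vdash_{tc} \omega_0 : \Delta_0$,
$\Gamma \vdash \overleftarrow{\omega_0}, \delta : \Delta_0, \Delta_1$ and $\omega_1 = \omega_0 : \Delta_0 \prec \delta : \Delta_1$. Then $\Gamma \vdash_{tc} \omega_0, \omega_1 : \Delta_0, \Delta_1$.

Proof. By induction on the definition of telescoped coercion extension. □
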
6.4.2 Subject reduction 

The point of all the work pushing coercions around is that subject reduction is 
easy to prove. It is enough to inspect the reduction steps and verify that each 
one preserves the type up to syntactic equality. 

Theorem 6.11 (Subject reduction). The operational semantics preserves types:
if $\Gamma \vdash \rho :^{\Phi} \tau$ and $\rho \longrightarrow \rho'$ then $\Gamma \vdash \rho' :^{\Phi} \tau$.

Proof. By induction on the reduction step. I consider some illustrative cases. 
For the /3-reduction step 

(Aa: $ K. e)% — > [p/ a ] e 

inversion gives T h (Aa : *k . e) p :^ [p/a] r, so T h Aa : */t . e :^ (a k) — >■ r and 
r h p -® II ^ k. Then inversion on the first premise gives r, a :* k h e :^ r and 
substitution (Lemma 6.5) gives T h [p/a] e :^ [p/a] r as required. 

If the scrutinee of a dependent case expression takes a step, its type is
preserved by induction, and the definition of coercion for case branches ensures that
the whole expression is well-typed (by the substitution lemma).

For the dependent case analysis step 

K A ->■ p e Wi 
dcaseK^offc^T — ► [(5, (K^ 5))/A] p 

from T h dcase K ip 5 of br{ % r inversion gives that T \- Kip5 : n ^* and 
T h bri :* (K^ 5 : D^) ► r. Suppose K has type (a,- : v /t/, A) — >■ D a^, then 
(2j K/j , A) / (n I $) by Lemma 6.1. Now substitution gives that 
r h 5, (K^5) : [VV«j : v k/] A/n/$, c : D ~ K^A. Inversion on the rule 

for case branches gives T, [ip/cij : v n/] A /II, c : D ~ K^A h p :* r and 

applying Lemma 6.8 gives T h [(5, (K^5))/A] p :* r. 
For the scrutinee reduction step 

£ 9 K :* K : v k/',A) DoT 

; i i 

u = (7$, V{, nth* 7) : di y Ki -< 5 : A 

: i % . % 

from T h tc (t,-, fj, nth 8 7) : a; : v «;,• and T h t^,<5 : a; : v , A / <&, Lemma 
6.10 gives T h tc (ri, fj, nth 4 7) , w : a; : v A / $. Hence Lemma 6.2 implies 
that r h U^, ut : Oi : v A/$ and so T h Ku^'at :* Df7 8 '. □ 

6.5 Consistency and progress 

To prove progress, I must demonstrate the consistency of closed terms in the □ 
fragment, as the existence of a coercion between dissimilar types would lead to 
stuck terms. For example, if $\cdot \vdash \gamma :^{\Box} \mathsf{Bool} \sim (\mathsf{Bool} \to \mathsf{Bool})$ then $(\mathsf{True} \triangleright \gamma)\ \mathsf{False}$
is well-typed but stuck. I will prove consistency as a corollary of a more general
theorem, by defining a compatibility relation between type expressions that
implies they have the same head constructor, and showing that provably equal 
expressions are compatible. Compatibility will require that closed expressions re- 
duce to head-normal forms with identical outermost constructors and compatible 
subcomponents, if they terminate at all. 

Consistency and progress depend on the usual canonical forms lemma, which 
is easy to prove thanks to the very restricted definitional equality. 

Lemma 6.12 (Canonical forms). If $v$ is a value and $\Gamma \vdash v :^{\Phi} \tau$ then $\tau$ is a value
type. Moreover,

(a) If $\tau = (a :^{\Phi} \kappa) \to \upsilon$ then $v$ is of the form $\lambda a :^{\Phi} \kappa.\ e$ or $K\ \delta$.

(b) If $\tau = D\ \psi$ then $v$ is of the form $K\ \delta$.

(c) If $\tau = *$ then $v$ is a value type.

Proof. By induction on the typing derivation. □

6.5.1 The definition of compatibility 

Given the reduction relation on types, obvious choices for a type equivalence
relation include joinability or the equivalence closure of reduction. These
ensure that equivalent types have the same head constructors, so would
guarantee consistency. However, they are too strong: for example, there is a
coercion between $(c :^{\Box} (D_1 \sim D_2)) \to D_1$ and $(c :^{\Box} (D_1 \sim D_2)) \to D_2$ given by
$\mathsf{cong}\ \Box\ (D_1 \sim D_2)\ (\lambda c_1 :^{\Box} D_1 \sim D_2,\ c_2 :^{\Box} D_1 \sim D_2,\ c' :^{\Box} c_1 \sim c_2.\ c_1)$, but these
types are clearly not joinable if $D_1$ and $D_2$ are distinct constructors.

Weirich et al. (2013) get round this problem by restricting the well-typed 
coercions so that they cannot use potentially inconsistent assumptions. This is 
necessarily over-restrictive, because there can be no decision procedure for con- 
sistency of a set of assumptions. A coercion between distinct types can exist 
in an inconsistent context, and this does not endanger consistency of the whole 
system. Instead, I define compatibility on closed types to ensure they have the 
same head constructors, and extend it to open types by considering closed in- 
stances. All types are equivalent in an inconsistent context, since there are no 
closed instances. Thus the existence of a coercion between two types can imply 
their compatibility. This novel approach works well for the evidence language, 
where types have a well-defined operational semantics; it would be interesting to 
see if it can be applied to System Fc with type families. 

The definitions and proofs in this section are rather technical, and can safely 
be skipped by the casual reader. The payoff comes in Subsection 6.5.4: the 
evidence language has the progress and type safety properties. I will present the 
structure of the argument here, and defer the details of proofs to Appendix D.4. 

I will define $A_k(\varphi)$, where $\varphi$ is a proposition and $k$ is a natural number, to
mean that $\varphi$ cannot be falsified within $k$ steps. A proposition 'really' holds if the
relation holds for all $k$. This indexing ensures that the relation is well-founded,
and facilitates proof by induction on the index, like a step-indexed logical relation.

Definition 6.1 (Computational, coerced and structural type expressions). A
type expression is computational if it is a function application $f(\delta)$ or a case
expression $(\mathsf{d})\mathsf{case}\ \tau\ \mathsf{of}\ \overline{br_i}$; coerced if it is a coercion $\tau \triangleright \gamma$; otherwise it is structural.

Roughly speaking, $A_k(\tau \sim \upsilon)$ means that if $\tau$ and $\upsilon$ are computational, they
can both take a step and remain related, whereas if they are structural, they both
have the same structure and the substructures are compatible. Any coercions are
ignored (but must be between compatible types). Moreover, the kinds of the
expressions must be compatible.

Definition 6.2 (Compatibility). Define $A_k(\varphi)$ inductively on $k$, provided there
exists $\gamma$ such that $\cdot \vdash \gamma :^{\Box} \varphi$. For such a $\gamma$, I write $A_k(\gamma : \varphi)$ to mean that
$A_k(\varphi)$ holds. The index $k$ represents the depth of comparison to perform. $A_0(\varphi)$
holds for any well-typed coercion. For $k > 0$, $A_k(\varphi)$ is defined based on $\varphi$.

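If $\varphi$ equates two computational expressions, their reducts must be compatible:

• $A_k((\tau_1 : \kappa_1) \sim (\tau_2 : \kappa_2))$ for $\tau_1$ and $\tau_2$ computational if $A_{k-1}(\kappa_1 \sim \kappa_2)$,
$\tau_1 \longrightarrow \upsilon_1$, $\tau_2 \longrightarrow \upsilon_2$ and $A_{k-1}(\upsilon_1 \sim \upsilon_2)$.

If $\varphi$ equates two structural expressions, these must be the same structure and
the subcomponents must be compatible:

• $A_k((H : \kappa) \sim (H : \kappa))$ if $A_{k-1}(\kappa \sim \kappa)$;

• $A_k((\tau_1\,{}^{\Phi}\,\upsilon_1 : \kappa_1) \sim (\tau_2\,{}^{\Phi}\,\upsilon_2 : \kappa_2))$ for $\Phi \neq \Box$ if $A_{k-1}(\kappa_1 \sim \kappa_2)$, $A_k(\tau_1 \sim \tau_2)$
and $A_k(\upsilon_1 \sim \upsilon_2)$;

• $A_k((\tau_1\,{}^{\Box}\,\eta_1 : \kappa_1) \sim (\tau_2\,{}^{\Box}\,\eta_2 : \kappa_2))$ if $A_{k-1}(\kappa_1 \sim \kappa_2)$, $A_k(\tau_1 \sim \tau_2)$, $A_{k-1}(\eta_1 : \varphi_1)$
and $A_{k-1}(\eta_2 : \varphi_2)$;

• $A_k(\tau_1 \to \upsilon_1 \sim \tau_2 \to \upsilon_2)$ if $A_k(\tau_1 \sim \tau_2)$ and $A_k(\upsilon_1 \sim \upsilon_2)$;

• $A_k((a_1 :^{T} \kappa_1) \to \tau_1 \sim (a_2 :^{T} \kappa_2) \to \tau_2)$ if $A_k(\kappa_1 \sim \kappa_2)$ and for all $l < k$,
$A_l((\upsilon_1 : \kappa_1) \sim (\upsilon_2 : \kappa_2))$ implies $A_l([\upsilon_1/a_1]\tau_1 \sim [\upsilon_2/a_2]\tau_2)$;

• $A_k((c_1 :^{\Box} \varphi_1) \to \tau_1 \sim (c_2 :^{\Box} \varphi_2) \to \tau_2)$ if $A_k(\varphi_1 \sim \varphi_2)$ and for all $l < k$,
$A_l(\gamma_1 : \varphi_1)$ and $A_l(\gamma_2 : \varphi_2)$ imply $A_l([\gamma_1/c_1]\tau_1 \sim [\gamma_2/c_2]\tau_2)$.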

If one side is structural and the other is computational, the computational
expression must reduce to a compatible structure:

• $A_k(\tau_1 \sim \tau_2)$ where $\tau_1$ is structural and $\tau_2$ is computational if $\tau_2 \longrightarrow^{*} \upsilon$
where $\upsilon$ is structural or coerced and $A_k(\tau_1 \sim \upsilon)$;

• $A_k(\tau_1 \sim \tau_2)$ where $\tau_1$ is computational and $\tau_2$ is structural if $\tau_1 \longrightarrow^{*} \upsilon$
where $\upsilon$ is structural or coerced and $A_k(\upsilon \sim \tau_2)$.

If either side is coerced, the coercion must be between compatible types and
the underlying expressions must be compatible:

• $A_k(\tau_1 \triangleright \eta \sim \tau_2)$ if $A_k(\tau_1 \sim \tau_2)$ and $A_{k-1}(\eta : \kappa_1 \sim \kappa_2)$;

• $A_k(\tau_1 \sim \tau_2 \triangleright \eta)$ where $\tau_1$ is not coerced if $A_k(\tau_1 \sim \tau_2)$ and $A_{k-1}(\eta : \kappa_1 \sim \kappa_2)$.

Now the definition of compatibility is extended to quantified equations, by
taking closed instances:

• $A_k((a :^{T} \kappa) \to \varphi)$ if for all $l < k$, $A_l((\tau : \kappa) \sim (\tau : \kappa))$ implies $A_l([\tau/a]\varphi)$;

• $A_k((c :^{\Box} \varphi') \to \varphi)$ if for all $l < k$, $A_l(\eta : \varphi')$ implies $A_l([\eta/c]\varphi)$;

• $A_k((x :^{\lambda} \tau) \to \varphi)$ if $A_k(\varphi)$.

This definition extends naturally to closed telescoped coercions, requiring that 
all the coercions are between compatible types. 

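Definition 6.3. Define $A_k(\omega : \Delta)$ where $\cdot \vdash_{tc} \omega : \Delta$ by

• $A_k(\cdot : \cdot)$ always;

• $A_k(\omega, (\tau, \upsilon, \gamma) : \Delta, a :^{T} \kappa)$ if $A_k(\omega : \Delta)$ and $A_k(\gamma : \tau \sim \upsilon)$;

• $A_k(\omega, (\eta, \eta') : \Delta, c :^{\Box} \varphi)$ if $A_k(\omega : \Delta)$ and both $A_{k-1}(\eta : [\overleftarrow{\omega}/\Delta]\varphi)$ and
$A_{k-1}(\eta' : [\overrightarrow{\omega}/\Delta]\varphi)$;

• $A_k(\omega, (e, e') : \Delta, x :^{\lambda} \kappa)$ if $A_k(\omega : \Delta)$.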

For consistency and progress, the signature $\Sigma$ must not contain any inconsistent
axioms or malformed types (as they would invalidate consistency), or any
undefined runtime functions (as they would invalidate progress). These conditions
are encapsulated in the following definition.

Definition 6.4 (Good declaration and signature). A signature $\Sigma$ is good if all
the entries in $\Sigma$ are good, where:

• An axiom $C :^{\Box} \varphi$ is good if $A_k(\varphi)$ and $A_k(\varphi \sim \varphi)$ for all $k$.

• A constructor $H :^{\Phi} \tau$ is good if $A_k(\tau \sim \tau)$ for all $k$.

• A static function declaration $f\,[\Delta] :^{T} \kappa$ is good if it has a unique corresponding
definition $f\,[\Delta] = \tau :^{T} \kappa$ in $\Sigma$, such that $A_k(\omega : \Delta)$ implies
$A_k([\overleftarrow{\omega}/\Delta]\tau \sim [\overrightarrow{\omega}/\Delta]\tau)$.

• A dynamic function declaration $f\,[\Delta] :^{\lambda} \kappa$ is always good, since it cannot
occur in types.

From now on I will implicitly assume that the signature $\Sigma$ is good.



6.5.2 Properties of compatibility 

I now prove that compatibility is a partial equivalence relation on types, that it
respects computation and is a congruence. It is reflexive on well-typed expressions,
but to prove this I must show that all well-typed coercions are compatible,
which is the main result in the following section.

Lemma 6.13 (Symmetry). If $A_k(\tau \sim \upsilon)$ then $A_k(\upsilon \sim \tau)$.

Proof. By inversion on the definition. It is clear that every case is symmetric. □

Lemma 6.14 (Transitivity). If $A_k(\tau \sim \upsilon)$ and $A_k(\upsilon \sim \kappa)$ then $A_k(\tau \sim \kappa)$.

Proof. By induction on $k$ and inversion on $A_k(\varphi)$. For details, see Appendix D.4
(page 250). □

In the usual step-indexed fashion, decreasing the step index preserves the 
relation, because strictly less of the expressions' structures are compared. 

Lemma 6.15 (Downward closure).

(a) If $A_{k+1}(\varphi)$ then $A_k(\varphi)$.

(b) If $A_{k+1}(\omega : \Delta)$ then $A_k(\omega : \Delta)$.

Proof. Part (a) is by induction on $k$ and inversion on $A_{k+1}(\varphi)$. Part (b) follows
from part (a) by structural induction on $\omega$. □



To show that the step coercion preserves compatibility, use the following:

Lemma 6.16 (Reduction preserves compatibility). If $\tau \longrightarrow_{\mathsf{push}} \upsilon$ and $A_k(\tau \sim \tau)$
then $A_{k-1}(\tau \sim \upsilon)$.

Proof. By induction on $k$ and the reduction step $\tau \longrightarrow_{\mathsf{push}} \upsilon$. For details, see
Appendix D.4 (page 252). □

The definition of compatibility makes it a congruence for structural expressions
and coercions. I must prove that it is a congruence for case analysis.

Lemma 6.17 (Congruence for case analysis). If $A_k(e \sim e')$ and $A_k(br_i \approx br'_i)$
for all $i$, then $A_k((\mathsf{d})\mathsf{case}\ e\ \mathsf{of}\ \overline{br_i} \sim (\mathsf{d})\mathsf{case}\ e'\ \mathsf{of}\ \overline{br'_i})$.

Proof. By induction on $k$ and case analysis on $e$ and $e'$, using Lemma 6.16. For
details, see Appendix D.4 (page 255). □

To show compatibility of the $\mathsf{kind}\ \gamma$ coercion, which extracts a proof that the
kinds are equal from a proof that two types are equal, I will need the following:

Lemma 6.18 (Compatibility of kinds). If $A_k((\tau_1 : \kappa_1) \sim (\tau_2 : \kappa_2))$ holds, then
$A_{k-1}((\kappa_1 : *) \sim (\kappa_2 : *))$.

Proof. By induction on $k$ and inversion on $A_k(\varphi)$. □

6.5.3 Well-typed coercions are compatible 

Finally, I can show that the existence of a coercion between types implies their 
compatibility. Consistency is then an immediate corollary. Crucially, the logical 
unsoundness of the type language, due to the presence of general recursion and 
the paradoxical * : *, does not affect the □ fragment. General recursion is not 
available in coercions, and they may perform only a finite amount of computation. 

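Lemma 6.19 (Basic Lemma).

(a) If $\Gamma \vdash \tau :^{\forall} \kappa$ then for all $k$, $A_k(\omega_0 : \Gamma)$ implies $A_k([\overleftarrow{\omega_0}/\Gamma]\tau \sim [\overrightarrow{\omega_0}/\Gamma]\tau)$.

(b) If $\Gamma \vdash br :^{\forall} \upsilon \blacktriangleright \tau$ or $\Gamma \vdash br :^{\forall} (e : \upsilon) \blacktriangleright \tau$ then for all $k$, $A_k(\omega_0 : \Gamma)$ implies
$A_k([\overleftarrow{\omega_0}/\Gamma]br \approx [\overrightarrow{\omega_0}/\Gamma]br)$.

(c) If $\Gamma \vdash \gamma :^{\Box} \varphi$ then for all $k$, $A_k(\omega_0 : \Gamma)$ implies $A_k([\overleftarrow{\omega_0}/\Gamma]\varphi)$ and $A_k([\overrightarrow{\omega_0}/\Gamma]\varphi)$.

(d) If $\Gamma \vdash_{tc} \omega : \Delta$ then for all $k$, $A_k(\omega_0 : \Gamma)$ implies $A_k([\omega_0/\Gamma]\omega : \Delta)$.

Proof. By structural induction on typing derivations. Note that $k$ is quantified
inside the inductive hypothesis. For details, see Appendix D.4 (page 256). □

Theorem 6.20 (Consistency). If $\cdot \vdash \gamma :^{\Box} (\xi_0 : *) \sim (\xi_1 : *)$ then $\xi_0$ and $\xi_1$ have
the same head constructor (that is, either $\xi_i = (a_i :^{\Phi} \kappa_i) \to \tau_i$ or $\xi_i = H\ \psi_i$).

Proof. This follows from the special case of Lemma 6.19(c) where $\Gamma$ is empty, since
$A_1(\xi_0 \sim \xi_1)$ implies $\xi_0$ and $\xi_1$ have the same head constructor, by definition. □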
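
As an informal check of how this rules out the stuck term from the start of the section, the argument can be unfolded for the equation between Bool and Bool → Bool. The chain below is a sketch, not a derivation from Appendix D.4; it uses only Definition 6.2, the fact that compatibility of the empty telescoped coercion holds trivially (Definition 6.3), and Lemma 6.19(c).

```latex
\begin{align*}
& \cdot \vdash \gamma :^{\Box} (\mathsf{Bool} : *) \sim ((\mathsf{Bool} \to \mathsf{Bool}) : *)
  && \text{(assumption)} \\
\Longrightarrow\quad & A_1\big((\mathsf{Bool} : *) \sim ((\mathsf{Bool} \to \mathsf{Bool}) : *)\big)
  && \text{(Lemma 6.19(c) with empty context)} \\
\Longrightarrow\quad & \bot
  && \text{(both sides are structural, but no clause relates } H \text{ to a function type)}
\end{align*}
```

Neither side is a function application, a case expression or a coerced expression, so only the structural clauses could apply, and each of those requires the same outermost form on both sides.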

6.5.4 Progress

Thanks to the consistency proof, progress is straightforward, as in previous work. 
Type safety is an immediate corollary. 

Theorem 6.21 (Progress). If $\cdot \vdash e :^{\lambda} \tau$ then either $e$ is a value, $e$ is a coerced
value or there is some $e'$ such that $e \longrightarrow e'$.

Proof. By structural induction on the typing derivation. When a coerced value
prevents reduction, Theorem 6.20 ensures that the relevant push rule applies. □

Corollary 6.22 (Type safety). If $\cdot \vdash e :^{\lambda} \tau$ and $e \longrightarrow^{*} e'$ then either $e'$ is a
value, $e'$ is a coerced value or there is some $e''$ such that $e' \longrightarrow e''$.

6.6 Erasure 

The erasure operation, defined in Figure 6.14, produces a runtime version $\|e\|$
of an evidence term $e$ (phase λ) by removing static subterms (phases ∀ and □).
Similarly, an erasure operation $\|\delta : \Delta\|$ is defined for a vector $\delta$ in telescope $\Delta$.

Runtime terms $r$ are a subgrammar of evidence terms $e$, except that λ-abstractions
do not record their types and an additional marker _ is used to
indicate where a subterm has been erased. This could be implemented with a
unit type. No casts are present in runtime terms, and dependent case analysis has
been turned into normal case analysis. The grammar of runtime terms is:

$$r ::= a \mid \lambda x.\,r \mid r\ r' \mid K \mid f(\overline{r_k}) \mid \mathsf{case}\ r\ \mathsf{of}\ \overline{K_i\ \Delta_i \to r_i} \mid \_$$

Erasure uses the phase annotations on applications to avoid reconstructing
the type of the term, but if they were not present it would be easy to define
erasure in a typed fashion, since the evidence term encodes its typing derivation.
Saturated function applications $f(\delta)$ are not annotated with phases, so erasure
for vectors $\|\delta : \Delta\|$ uses the telescope $\Delta$ from the declaration of the function.

The operational semantics of runtime terms is given in Figure 6.15. It is a 
subset of the rules from Figure 6.13, omitting those related to coercions, and 
erasing the bodies of functions defined in the signature. 
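
As a rough illustration of erasure, the following Haskell sketch models a small fragment of the evidence language. It is not the definition from Figure 6.14: the names `Phase`, `Term`, `RTerm` and `erase` are hypothetical, type annotations on abstractions are left out, and case expressions and saturated function applications are omitted.

```haskell
-- The four phases of the evidence language.
data Phase = PForall | PBox | PPi | PLam
  deriving (Eq, Show)

-- A fragment of evidence terms; applications carry a phase annotation.
data Term
  = Var String
  | Lam Phase String Term     -- abstraction (type annotation omitted here)
  | App Phase Term Term       -- phase-annotated application
  | Con String                -- data constructor
  | Cast Term String          -- e |> gamma, with the coercion named by a string
  deriving (Eq, Show)

-- Runtime terms: no casts, no annotations, plus the erased marker.
data RTerm
  = RVar String
  | RLam String RTerm
  | RApp RTerm RTerm
  | RCon String
  | Erased                    -- the _ marker
  deriving (Eq, Show)

-- Arguments at the static phases (forall and box) are removed by erasure.
isStatic :: Phase -> Bool
isStatic p = p == PForall || p == PBox

-- Abstractions are kept even when their argument is static, so that
-- reduction steps before and after erasure line up one-to-one.
erase :: Term -> RTerm
erase (Var x)     = RVar x
erase (Lam _ x e) = RLam x (erase e)
erase (App p f a)
  | isStatic p    = RApp (erase f) Erased
  | otherwise     = RApp (erase f) (erase a)
erase (Con k)     = RCon k
erase (Cast e _)  = erase e
```

For instance, `erase (App PForall (Lam PForall "a" (Var "x")) (Con "Bool"))` keeps the abstraction and the application but replaces the type argument with `Erased`.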

The motivation for replacing erased subterms with the _ marker, rather than
removing them (and the corresponding λ-abstractions) altogether, 10 is that it
simplifies the correspondence between the original and erased operational semantics.
This correspondence is shown by the following lemma.

10 Of course, subsequent optimisation of runtime terms could remove the unnecessary redexes. 

Lemma 6.23. If $\cdot \vdash e :^{\lambda} \tau$, then either

• $e$ is a coerced value and $\|e\|$ is a value; or

• $e \longrightarrow e'$ and either $\|e\| \longrightarrow \|e'\|$ or $\|e\| = \|e'\|$.

Proof. If $e$ is a coerced value, it is easy to see that $\|e\|$ is a value. If not,
Theorem 6.21 means that $e$ can take a step to some $e'$; proceed by induction on the
step taken. For most steps, the result follows immediately by induction or the
fact that both $e$ and $e'$ are identical after erasure. The cases for β-reduction and
definitional expansion make use of the fact that erasure commutes with
substitution, i.e. $\|[\delta/\Delta]e\| = [\|\delta : \Delta\|/\Delta]\|e\|$. When the scrutinee of a case expression
takes a push step, this does not change its erasure. □

The erasure operation described above removes all static information from
evidence terms. In some cases it is also possible to erase dependencies without
erasing types entirely: datatype indices are removed and Π-types become
non-dependent functions. For example, in inch syntax,

data Vec :: * -> N -> * where
  Nil  :: Vec a Zero
  Cons :: a -> Vec a n -> Vec a (Suc n)

would be erased to

data Vec :: * -> * where
  Nil  :: Vec a
  Cons :: a -> Vec a -> Vec a

otherwise known as the type of lists, and

replicate :: Π (n :: N) -> a -> Vec a n

would be erased to

replicate :: N -> a -> Vec a

Thus an inch term can sometimes be converted into a Haskell term, or an evi- 
dence term can be converted into a System Fc-like term. However, this is not 
possible for terms containing large eliminations, where a type is computed from 
a shared term by type-level case analysis. This approach is used in the prototype 
implementation, as discussed in Chapter 8. 
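
The effect of erasing indices can be mimicked in GHC Haskell. The sketch below is an illustration only, assuming the `GADTs` and `DataKinds` extensions; the names `List`, `forget` and `replicateErased` are hypothetical and do not come from the prototype implementation.

```haskell
{-# LANGUAGE GADTs, DataKinds, KindSignatures #-}
import Data.Kind (Type)

data Nat = Zero | Suc Nat

-- Length-indexed vectors, as a GADT over promoted naturals.
data Vec (a :: Type) (n :: Nat) where
  Nil  :: Vec a 'Zero
  Cons :: a -> Vec a n -> Vec a ('Suc n)

-- The index-erased datatype: ordinary lists.
data List a = LNil | LCons a (List a)
  deriving (Eq, Show)

-- Forgetting the index maps each constructor to its erased counterpart.
forget :: Vec a n -> List a
forget Nil         = LNil
forget (Cons x xs) = LCons x (forget xs)

-- The erased form of replicate: the dependent argument becomes an
-- ordinary runtime natural number and the result loses its index.
replicateErased :: Nat -> a -> List a
replicateErased Zero    _ = LNil
replicateErased (Suc n) x = LCons x (replicateErased n x)
```

The erased `replicateErased` still needs its numeric argument at runtime, which is why the Π-type becomes a non-dependent function rather than disappearing.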

6.7 Discussion



I conclude this chapter with comments on possible extensions to the evidence 
language, and a comparison to its predecessors. In the following chapter, I will 
show how high-level inch source code can be translated to the evidence language 
discussed in this chapter, by a process of elaboration. 

6.7.1 Representing numbers 

So far I have said a great deal about how to manage Π-types, and indeed phases
more generally, but I have not said much about numbers. How might the evidence 
language described here be extended to support them? 

One option is to adopt the traditional algebraic datatype presentation of nat- 
ural numbers and integers: 

data N = Zero | Suc N

data Z = NonNegative N | StrictlyNegative N

Mathematical operations such as addition can be defined on these representations 
as pattern-matching functions, and the machinery in this chapter will allow them 
to be used on the type level. This is rather inefficient, though perhaps the com- 
piler could replace the representation with a native version after typechecking. 

However, the equational theory desired for these operations is more than the 
behaviour delivered by computation. By adding axioms to the signature, prop- 
erties such as the commutativity of addition can be made available as coercions, 
and hence used by the elaborator. Consistency of the system, and hence type 
safety, are ensured provided the conditions of Definition 6.4 are satisfied. 

In particular, any new axioms must be compatible, i.e. true on closed
instances. The commutativity of addition axiom

$$(a :^{\forall} \mathsf{N},\ b :^{\forall} \mathsf{N}) \to (a + b) \sim (b + a)$$

is fine, because (a + b) ~ (b + a) holds by computation whenever a and b are
replaced with closed values, but a bogus axiom such as

$$(a :^{\forall} \mathsf{N}) \to a \sim \mathsf{Suc}\ a$$

will not be compatible.
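
The phrase "true on closed instances" can be illustrated by testing candidate axioms on small closed values. The Haskell sketch below is only a finite experiment, not a proof of compatibility in the sense of Definition 6.2; the helper names `upTo` and `holdsOnClosed` are hypothetical.

```haskell
data N = Zero | Suc N
  deriving (Eq, Show)

-- Addition by pattern matching on the first argument.
plus :: N -> N -> N
plus Zero    n = n
plus (Suc m) n = Suc (plus m n)

-- The closed values Zero, Suc Zero, ... up to the given size.
upTo :: Int -> [N]
upTo k = take (k + 1) (iterate Suc Zero)

-- Does lhs a b compute to the same value as rhs a b on all small closed a, b?
holdsOnClosed :: Int -> (N -> N -> N) -> (N -> N -> N) -> Bool
holdsOnClosed k lhs rhs =
  and [ lhs a b == rhs a b | a <- upTo k, b <- upTo k ]

-- Commutativity of addition: both sides compute to the same value.
commutativityOK :: Bool
commutativityOK = holdsOnClosed 5 plus (flip plus)

-- The bogus axiom a ~ Suc a: fails already at a = Zero.
bogusOK :: Bool
bogusOK = holdsOnClosed 5 (\a _ -> a) (\a _ -> Suc a)
```

Here `commutativityOK` evaluates to `True` and `bogusOK` to `False`, matching the distinction drawn above between the two axioms.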

One problem with this approach is that some valid axioms do not satisfy the
compatibility relation, because they change termination properties. For example,

$$(a :^{\forall} \mathsf{Z}) \to (a - a) \sim \mathsf{Zero}$$

is not accepted, because if a is instantiated with a closed divergent term, then the
left-hand side diverges but the right does not. This could be resolved by extending
the definition of compatibility, so that rather than considering reduction alone,
numeric expressions could be simplified via axioms. Consistency would then
depend on a global property of the axioms, that they could not be used to derive
Zero ~ Suc Zero.

There is also more work to do on the evidence for inequality constraints. These 
can be encoded using algebraic datatypes, for example 

data m ≤ n where
  Z :: Zero ≤ n
  S :: m ≤ n -> Suc m ≤ Suc n

but it might be preferable to make use of the □ fragment to record known- 
consistent (and hence erasable) inequality proofs, just as coercions are known- 
consistent equality proofs. 
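
For comparison, the datatype encoding of inequality can be written in GHC Haskell as a GADT. This is a sketch under the assumption of the `GADTs` and `DataKinds` extensions; the names `Le`, `LeZ`, `LeS`, `leTrans` and `proofSize` are hypothetical. It shows why such proofs are not free: they are ordinary data with runtime structure.

```haskell
{-# LANGUAGE GADTs, DataKinds, KindSignatures #-}

data Nat = Zero | Suc Nat

-- Evidence that m is less than or equal to n.
data Le (m :: Nat) (n :: Nat) where
  LeZ :: Le 'Zero n
  LeS :: Le m n -> Le ('Suc m) ('Suc n)

-- Transitivity, by recursion on the two proofs.
leTrans :: Le l m -> Le m n -> Le l n
leTrans LeZ     _       = LeZ
leTrans (LeS p) (LeS q) = LeS (leTrans p q)

-- Proofs are data: this counts the LeS constructors in one.
proofSize :: Le m n -> Int
proofSize LeZ     = 0
proofSize (LeS p) = 1 + proofSize p

-- A proof that 1 <= 3.
oneLeThree :: Le ('Suc 'Zero) ('Suc ('Suc ('Suc 'Zero)))
oneLeThree = LeS LeZ
```

A proof of this kind grows with its left-hand index, whereas a proof recorded in the □ fragment could be erased altogether.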

6.7.2 Adding η-laws

Another desirable extension of the compatibility relation is support for η-conversion
of single-constructor (record) datatypes. For example, the usual fst and snd
projections from pairs are perfectly good shared definitions, so they can be used at
the type level. It would be useful to support the η-axiom

$$(a :^{\forall} *,\ b :^{\forall} *,\ x :^{\forall} (a, b)) \to x \sim (\mathsf{fst}(x), \mathsf{snd}(x))$$

which says that any inhabitant of a pair type is equal to the pair of its projections.
For example, this is needed to show that the type of paths in a binary relation 

data Path :: ((a, a) -> *) -> ((a, a) -> *) where
  Stop :: Path r (x, x)
  Step :: r (x, y) -> Path r (y, z) -> Path r (x, z)

forms an indexed monad. The following definition is accepted

returnIx :: r (x, y) -> Path r (x, y)
returnIx v = Step v Stop

but its type is insufficiently general; it should have the type

returnIx :: r c -> Path r c

which requires η-expansion.
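
The same datatype can be written in GHC Haskell with promoted pairs, which makes the restriction easy to experiment with. The sketch below assumes the `GADTs`, `DataKinds` and `PolyKinds` extensions; the relation `Edge` and the function `pathLength` are hypothetical additions for the demonstration.

```haskell
{-# LANGUAGE GADTs, DataKinds, PolyKinds, KindSignatures #-}
import Data.Kind (Type)

-- Paths in a binary relation r, indexed by a pair of endpoints.
data Path (r :: (a, a) -> Type) (ij :: (a, a)) where
  Stop :: Path r '(x, x)
  Step :: r '(x, y) -> Path r '(y, z) -> Path r '(x, z)

-- Accepted: the index is syntactically a pair.
returnIx :: r '(x, y) -> Path r '(x, y)
returnIx v = Step v Stop

-- A small relation on booleans with a single edge.
data Edge (ij :: (Bool, Bool)) where
  Flip :: Edge '( 'False, 'True)

pathLength :: Path r ij -> Int
pathLength Stop       = 0
pathLength (Step _ p) = 1 + pathLength p
```

Giving `returnIx` the signature `r c -> Path r c` makes GHC reject the definition for the same reason as above: without an η-law the checker cannot see that an arbitrary index `c` is a pair.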

As in the previous section, η-axioms change termination properties, because
x might diverge, so they do not satisfy the existing definition of compatibility.
However, as with numeric axioms, compatibility could be modified to build in
η-expansion, by defining $A_k((\tau, \tau') \sim \upsilon)$ for computational expressions $\upsilon$ to mean
$A_k(\tau \sim \mathsf{fst}(\upsilon))$ and $A_k(\tau' \sim \mathsf{snd}(\upsilon))$.
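
The change in termination behaviour can be observed directly in Haskell, whose lazy pairs behave analogously. This is an analogy only: the evaluation strategy of the evidence language need not coincide with Haskell's, and the names `etaExpand`, `reachesWHNF` and `diverging` are hypothetical.

```haskell
-- The eta-expansion of a pair.
etaExpand :: (a, b) -> (a, b)
etaExpand x = (fst x, snd x)

-- Force the argument to weak head normal form, then answer True.
reachesWHNF :: (a, b) -> Bool
reachesWHNF p = p `seq` True

-- Stands in for a closed term with no head-normal form.
diverging :: (Int, Int)
diverging = undefined

-- The expanded form is headed by the pair constructor without inspecting x.
expandedIsDefined :: Bool
expandedIsDefined = reachesWHNF (etaExpand diverging)
```

Here `expandedIsDefined` is `True`, whereas evaluating `reachesWHNF diverging` raises an exception. The two sides of the η-axiom therefore differ on a term without a head-normal form, which is the mismatch the modified definition of compatibility would need to absorb.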

6.7.3 Related work 

System $F_C(X)$ was introduced by Sulzmann et al. (2007) as a new core language
for GHC. It is based on System F, the second-order polymorphic λ-calculus
(Reynolds, 1974; Girard et al., 1989), but adds algebraic datatypes, higher kinds 
and explicit coercions (proofs of type equality). It was motivated by the need to 
elaborate GADTs and type families, both of which can be understood as exten- 
sions to the equational theory of types: case analysis on GADTs introduces new 
equational hypotheses, which may be used to show the body is well-typed, while 
type families add axiomatically-defined type-level functions. This was a major 
advance on the previous approach used in GHC, of adding GADTs to System F 
directly. The (X) parameterisation in the system represents its dependence on 
an unspecified decision procedure for checking that a context is consistent, i.e. 
that the axioms and equational hypotheses do not entail a contradiction. The 
system was subsequently revised by the authors in the light of implementation 
experience (Sulzmann et al., 2009). 

Weirich et al. (2011a) developed System $F_{C2}$ to rectify a consistency problem
discovered in the implementation of GHC. This resulted from the combination 
of newtypes, which introduce axioms asserting their equality with the underlying 
representation type, and type families, which can distinguish between a newtype 
and its representation. They proved that their system is consistent if type family 
declarations are non-overlapping, using an approach based on rewriting. 

Development continued with System $F_C^{\uparrow}$ (Yorgey et al., 2012), which adds
datatype promotion and kind polymorphism. This allows algebraic datatypes
to be used as kinds, so type-level programming need not be entirely untyped:
for example, a datatype of Peano numerals can be promoted to the kind level
and used to index a GADT of vectors. However, kind equality in $F_C^{\uparrow}$ is purely
syntactic, so it is not possible to promote GADTs. Vytiniotis et al. (2012) tweaked
the system to support deferred type errors, by distinguishing between an 'unlifted'
type of known-good equality proofs and a 'lifted' type of potentially bogus proofs
that must be evaluated before use.

Weirich et al. (2013) took the datatype promotion and kind polymorphism 
ideas to their logical conclusion, by eliminating the distinction between types 
and kinds. The evidence language described in this chapter continues in this 
direction, as it makes no distinction between types and kinds. It goes further 
in that terms and types share a common syntax and typing rules, though the 
phase restrictions mean not every expression form is available at every phase. 
Moreover, it adds Π-types, allowing real dependency without the need for the
singleton construction. 

6.7.4 Future work 

A key idea of this chapter is the use of a common syntax for terms, types and 
kinds, while the phase distinction is maintained by indexing typing judgments 
with the phase at which they apply. Variables in the context carry a phase, 
and application allows for promotion implicitly, as described in Section 6.2. An 
ordering on phases makes it possible for data at one phase to be used at another, 
thereby streamlining the presentation of Π-types.

Phases need not be confined to this system, however: they can be defined 
for any Pure Type System. The set of phases need not be {∀, □, Π, λ} as in
this chapter, but could be any partially ordered set with a suitable relativisation
operator Φ / Ψ. For example, a system with two phases could model a dependent
type theory that distinguishes between runtime and compile-time data. The 
results of this chapter illustrate the properties required for a system of phases. 
Work is ongoing to develop the theory of phases and investigate its applications. 

The novel consistency proof for coercions given in Section 6.5 takes a different 
approach to previous work, and thereby lifts a technical restriction on the use 
of potentially inconsistent assumptions in coercions between □-quantified types.
However, this approach relies on the common operational semantics for types 
and terms, and in particular the treatment of type functions via case analysis. It 
remains to be seen whether the method can be extended to support the notion 
of type families in System $F_C$, which are defined axiomatically.

Chapter 7



Producing the evidence: 
elaborating inch 

Broadly construed, elaboration is a type-directed process of translating a high- 
level source language into a more explicit intermediate language, inferring details 
that were originally left implicit. Section 2.4 (page 27) showed how to elaborate 
λ-calculus with let-expressions into explicitly-typed System F. GHC elaborates
Haskell programs into System Fc, which adds algebraic datatypes, higher kinds 
and type equality constraints to System F. Dependently typed languages such 
as Epigram are explained by elaboration into a type theory, with the elaborator 
synthesising implicit arguments and solving higher-order unification problems. 

Following the Curry-Howard correspondence, elaboration of programming lan- 
guages is closely connected to generating proof objects from proof scripts in in- 
teractive theorem provers (such as Coq with its core language Gallina). Here, 
the primary motivation is ensuring correctness through the de Bruijn criterion. 
A well-understood kernel theory, with simple typechecking, allows the output 
from complex tactics and decision procedures to be independently rechecked. 
Likewise, GHC is an extremely complex program, and the ability to easily type- 
check programs in the intermediate language is crucial to debugging the compiler. 
Moreover, the intermediate language provides a good basis for implementing op- 
timisations, as all the typing information is available explicitly. 

In this chapter, I will describe the process for elaborating inch programs into 
the evidence language defined in the last chapter. I begin by introducing 'type 
schemes', which decorate evidence language types with information on implicit 
arguments (7.1), inspired by the work of Pollack (1990). The formal syntax 
of the inch language (7.2) includes a large fragment of the informally presented 
syntax. Instead of giving this a type system directly, I supply a non-deterministic
elaboration system that relates inch terms to evidence terms (7.3).

I then explain how partial knowledge and progress can be represented (7.4), 
and describe a definite (and necessarily incomplete) algorithm for elaboration 
(7.5). This is based on the work on type inference in Part I, where unification 
variables and unsolved constraints are explicitly represented using metacontexts. 
The algorithm reduces elaboration to constraint solving in the underlying evi- 
dence language. Designing a constraint solving algorithm is a complex task in 
itself. I will specify its required properties and describe it at a high level, but I 
will not describe constraint solving in detail. 

Elaboration of case expressions, which is the basis for the treatment of pattern- 
matching definitions, is somewhat involved and is therefore postponed (7.6). The 
chapter concludes with some contextualising remarks (7.7). 



7.1 Type schemes 

As discussed in Subsection 5.2.4 (page 99), it is desirable to have finer-grained 
control over which arguments are automatically inferred than the current Haskell 
policy of forcing ∀-bound arguments to be implicit and other arguments to be
explicit. Instead, constants and variables in the context will be assigned a type
scheme $\sigma$, consisting of a quantified type with annotations indicating whether
each argument is implicit ($:_i$) or explicit ($:_e$). The grammar of schemes is given
in Figure 7.1. For example, the type scheme of the equality constructor is

$$(\sim) : (a :^{\forall}_{i} *,\ b :^{\forall}_{i} *,\ x :^{\forall}_{e} a,\ y :^{\forall}_{e} b) \to *$$

meaning that the first two arguments are implicit and the last two are explicit,
thereby justifying the usual use of $(\sim)$ as a binary operator. The usual definition
of vectors gives rise to the following type schemes:



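$$\mathsf{Vec} : (a :^{\forall}_{e} *,\ n :^{\forall}_{e} \mathsf{N}) \to *$$

$$\mathsf{Nil} : (a :^{\forall}_{i} *,\ n :^{\forall}_{i} \mathsf{N},\ c :^{\Box}_{i} n \sim \mathsf{Zero}) \to \mathsf{Vec}\ a\ n$$

$$\mathsf{Cons} : (a :^{\forall}_{i} *,\ n :^{\forall}_{i} \mathsf{N},\ m :^{\forall}_{i} \mathsf{N},\ x :^{\lambda}_{e} a,\ xs :^{\lambda}_{e} \mathsf{Vec}\ a\ m,\ c :^{\Box}_{i} n \sim \mathsf{Suc}\ m) \to \mathsf{Vec}\ a\ n$$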



I will not give formal rules for elaborating source language datatype declarations 
into constructors with the appropriate type schemes. 

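Quantification over proofs (at phase □) will always be implicit, because
coercions are not written in the source language. On the other hand, dynamically
quantified variables will be explicit, as they cannot be determined by unification
constraints. Typeclasses can be seen as a form of implicit dynamic quantification,
with an alternative strategy for finding the corresponding arguments, based on
instance search rather than unification. This idea underlies Agda's support for
instance arguments (Devriese and Piessens, 2011). I will not consider typeclasses
further, but it is straightforward to handle them using the elaboration framework.

$$\sigma ::= \tau \mid (a :^{\Phi}_{e} \sigma') \to \sigma \mid (a :^{\Phi}_{i} \tau) \to \sigma$$

$$\Gamma, \Delta ::= \cdot \mid \Gamma, a :^{\Phi}_{e} \sigma \mid \Gamma, a :^{\Phi}_{i} \tau$$

$$\lfloor \tau \rfloor \mapsto \tau \qquad \lfloor (a :^{\Phi}_{e} \sigma') \to \sigma \rfloor \mapsto (a :^{\Phi} \lfloor \sigma' \rfloor) \to \lfloor \sigma \rfloor \qquad \lfloor (a :^{\Phi}_{i} \tau) \to \sigma \rfloor \mapsto (a :^{\Phi} \tau) \to \lfloor \sigma \rfloor$$

$$\lfloor \cdot \rfloor \mapsto \cdot \qquad \lfloor a :^{\Phi}_{e} \sigma, \Delta \rfloor \mapsto a :^{\Phi} \lfloor \sigma \rfloor, \lfloor \Delta \rfloor \qquad \lfloor a :^{\Phi}_{i} \tau, \Delta \rfloor \mapsto a :^{\Phi} \tau, \lfloor \Delta \rfloor$$

Figure 7.1: Grammar and erasure of schemes and annotated telescopes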

Like schemes, telescopes can be annotated to indicate whether each argument
is implicit or explicit; an annotated telescope is distinguished typographically
from the plain telescope $\Delta$. Erasing the annotations produces
a type or telescope in the evidence language, written $\lfloor\sigma\rfloor$ or $\lfloor\Delta\rfloor$ and defined in
Figure 7.1. I will assume that the signature $\Sigma$ assigns type schemes to constructors
H and annotated telescopes to shared functions $f$. In general, I will elide the
distinction between a quantified type $(a :^{\Psi} \kappa) \to \tau$ and an explicitly-quantified
type scheme $(a :^{\Psi}_{e} \kappa) \to \tau$.

Quantifying a scheme over a telescope, $(\Delta) \to \sigma$, and the relativisation operator
$\Delta /\!/ \Psi$ extend the definitions on evidence expressions in the obvious way.

I do not extend the type system of the evidence language itself. This avoids 
complicating the metatheory with details of implicit arguments. Rather, schemes 
are a tool for explaining how elaboration should generate explicit evidence terms. 
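
The grammar and erasure of Figure 7.1 can be sketched as a small data type. This is an illustrative Python sketch only, not part of inch: the names `Ty`, `Quant` and `erase` are hypothetical, types are kept abstract as strings, and phases are written as ASCII tags such as 'forall'.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Ty:
    """A quantifier-free type, kept abstract as a string."""
    text: str


@dataclass(frozen=True)
class Quant:
    """A scheme (var :phase/vis dom) -> body, where vis is 'i' or 'e'."""
    var: str
    phase: str
    vis: str
    dom: "Scheme"
    body: "Scheme"


Scheme = Union[Ty, Quant]


def erase(scheme: Scheme) -> str:
    """Forget the implicit/explicit annotations, keeping binder, phase and domain."""
    if isinstance(scheme, Ty):
        return scheme.text
    return f"({scheme.var} :{scheme.phase} {erase(scheme.dom)}) -> {erase(scheme.body)}"


# Scheme of the equality constructor: two implicit and two explicit arguments.
eq_scheme = Quant("a", "forall", "i", Ty("*"),
            Quant("b", "forall", "i", Ty("*"),
            Quant("x", "forall", "e", Ty("a"),
            Quant("y", "forall", "e", Ty("b"), Ty("*")))))
```

Erasing `eq_scheme` discards only the visibility tags, which is why the evidence language itself needs no notion of implicit argument.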

To obtain good inference behaviour, the elaboration algorithm should never 
attempt to 'guess' type schemes, only propagate them through bidirectional type 
inference. This avoids questions of how to unify type schemes. For this reason, 
the domain of an implicit quantification is always a type rather than a scheme. 

Following the Agda convention, the application syntax ρ{a = ρ'} is used to
supply explicitly an argument that is usually implicit, with name a. This means
type schemes cannot always be treated as equivalent up to α-conversion, as names
may appear outside the scope in which they are bound.

ρ ::=                      inch expression
      a                    variable
      ρ ρ'                 explicit application
      ρ{a = ρ'}            implicit application
      ∀(a:κ) → τ           explicit ∀ quantification
      Π(a:τ) → υ           explicit Π quantification
      τ → υ                function type (explicit λ quantification)
      H                    constructor
      f(δ)                 saturated function
      ρ : σ                type ascription
      λx . ρ               abstraction
      let x = ρ in ρ'      let binding
      _                    unknown

Figure 7.2: Grammar of inch expressions



7.2 Formal syntax of inch



The grammar of inch is presented in Figures 7.2 and 7.3. Like the evidence
language, there is a common syntax for expressions ρ, but I will usually use
τ, υ or κ for types and s or t for runtime terms (according to the respective
subgrammars). While the presentation using a common syntax is compact, it is
inessential and one may use different syntaxes for the term and type levels.

The main additions, compared to the evidence language, are: let-expressions;
the ability to ascribe a type scheme to an expression, written ρ : σ; and the
'unknown' marker _, which asks for a value to be inferred by the elaborator.
All coercion proofs are omitted (as they will be generated by constraint solving,
not supplied by the user). The inch syntax uses upright Greek letters such as ρ,
where the evidence syntax would use the italic $\rho$.

The syntax of inch type schemes σ is deliberately chosen to resemble Haskell
syntax. It will be translated by elaboration into the evidence language type
schemes of Section 7.1. There is no explicit quantifier at phase □, and the implicit
quantifier does not bind a variable, because proofs are invisible in the source
language. There is no implicit quantifier at phase λ, because no constraints would
be able to determine the value of a dynamic argument (absent typeclasses).

The source syntax should allow type ascriptions on quantifiers to be omitted,
but this can be dealt with by inserting _ markers as necessary. For example, the
universal quantifier ∀a.σ can be desugared into ∀(a : _).σ before being elaborated
into the evidence type scheme $(a :^{\forall}_{i} \kappa) \to \sigma$.

The treatment of (dependent) case analysis is postponed to Section 7.6. 
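
The desugaring of omitted quantifier annotations into _ markers, described above, can be sketched as a single recursive pass. This is a hypothetical illustration: the tuple encoding of schemes and the name `desugar` are assumptions made here, not part of the inch implementation.

```python
def desugar(scheme):
    """Insert the unknown marker '_' wherever a quantifier omits its annotation.

    Schemes are nested tuples ('forall', var, annotation_or_None, body);
    anything else is treated as a type and returned unchanged.
    """
    if isinstance(scheme, tuple) and scheme[0] == "forall":
        _, var, annotation, body = scheme
        filled = "_" if annotation is None else annotation
        return ("forall", var, filled, desugar(body))
    return scheme
```

After this pass every quantifier carries an annotation, so the elaborator only ever needs rules for the annotated form.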






σ ::=                      inch type scheme
      ∀(a:κ). σ            implicit ∀ quantification
      ∀(a:κ) → σ           explicit ∀ quantification
      Π(a:τ) → σ           explicit Π quantification
      Π(a:τ). σ            implicit Π quantification
      τ ⇒ σ                constraint (implicit □ quantification)
      σ' → σ               function type (explicit λ quantification)
      τ                    type

τ, υ, κ ::=                inch type
      a                    variable
      τ υ                  explicit application
      τ{a = υ}             implicit application
      ∀(a:κ) → τ           explicit ∀ quantification
      Π(a:τ) → υ           explicit Π quantification
      τ → υ                function type (explicit λ quantification)
      H                    rigid constructor
      f(δ)                 saturated function
      τ : σ                type ascription
      _                    unknown

s, t ::=                   inch term
      a                    variable
      t ρ                  explicit application
      t{a = ρ}             implicit application
      K                    data constructor
      f(δ)                 saturated function
      t : σ                type ascription
      λx . t               abstraction
      let x = s in t       let binding
      _                    unknown

δ ::= · | δ, ρ | δ, {a = ρ}

Figure 7.3: Grammar of inch type schemes, types, terms and vectors

7.3 Non-deterministic elaboration



I start by giving a non-deterministic presentation of elaboration that relates inch 
syntax to well-typed evidence terms, following the account of elaboration for 
implicit argument synthesis in the Calculus of Constructions by Luther (2003). 
The non-deterministic presentation resembles a type system for inch, as it allows 
types and evidence terms to be assigned, but does not indicate how they are to 
be discovered. I will then show how missing information can be reconstructed 
and give a deterministic algorithm. 

The non-deterministic elaboration rules are presented in Figures 7.4 to 7.6.
Intuitively, elaboration is built out of structural rules, which preserve the
structure of the input term, and wrapping rules, which add information missing from
the input. It is a kind of 'embedding' of inch terms into evidence terms. The
judgments defined are:

• $\Gamma \vdash \text{ρ} \leadsto \rho :^{\Psi} \sigma$, meaning that the inch expression ρ can be elaborated into
the evidence expression $\rho$ with scheme $\sigma$;

• $\Gamma \vdash \text{δ} \leadsto \delta : \Delta$, meaning that the inch vector δ elaborates to $\delta$ in the
telescope $\Delta$, by inserting implicit arguments;

• $\Gamma \vdash \text{σ} \leadsto \sigma$, meaning that the inch type scheme σ elaborates to the
evidence type scheme $\sigma$; and

• $\Gamma \vdash e : \sigma \preceq e' : \sigma'$, meaning that the type scheme $\sigma$ is subsumed by $\sigma'$ and
if $e : \sigma$ then $e' : \sigma'$.

7.3.1 Non-deterministic elaboration of expressions 

The judgment $\Gamma \vdash \text{ρ} \leadsto \rho :^{\Psi} \sigma$, defined in Figure 7.4, means that in a context $\Gamma$,
the inch expression ρ interpreted at phase $\Psi$ can be elaborated to the evidence
term $\rho$ with type scheme $\sigma$. This judgment is not defined at phase □ because
coercions do not appear in the source language. It uses annotated contexts $\Gamma$ so
that variables record whether they were explicitly bound, and hence in scope for
the source language, or implicitly bound, and hence inaccessible.

Most of the elaboration rules simply preserve the structure of the source
language expression in the target language. An important exception is the 'magic'
rule for implicit λ-abstraction

\[ \frac{\Gamma, a :^{\Psi}_{i} \tau \vdash \text{t} \leadsto e :^{\lambda} \sigma}{\Gamma \vdash \text{t} \leadsto \lambda a :^{\Psi} \tau .\, e :^{\lambda} (a :^{\Psi}_{i} \tau) \to \sigma} \]

$\Gamma \vdash \text{ρ} \leadsto \rho :^{\Psi} \sigma$ (ρ can elaborate to $\rho$ with scheme $\sigma$ at phase $\Psi \in \{\forall, \Pi, \lambda\}$)



[T\ h ctx 

T 3 a :f a 



r h a a :* a 



E3/[A]:*(t 



T h /(6) - :* [5/ A] « 

T h K ^ K : V * 
r,fl:^KhT^T: v * 



[rj h ctx 

rhH^H:*(7 

rhpwp:' 1 ' (A) ->■ a 

rhp6-»p5:* [5/ A] o 



T, x r h u -w v : 



.v 



rhV(a:K)4T-> (a: v /t) ->• r : v * T h n(x:T) ->■ v ~» (x: n r) ->• i> : v * 



r h v ^ d * 



.V 



[rj h ctx 

r h * -w * : v * 



[rj h ctx 



r h (~) <w (~) : v ( a : y *) ->. (6 £ *) ->. a ->. & ^ * 
r,j::*rhtwe: A (T r, a :f r h t e : A a 



rhAi.twAx: # r.e: A (s:*r)^(j r h t <w Aa: $ r . e : A (a :f r) ^ a 



r h s -w e : A er 
r,i: A ffht^e': A (i' 



T h let:r = sint (Ax: [crj . e) e : A «/ 



r h ff -w cr 
T h p -w p :* a 
T h (p: a) -w p :* a 



[Tj h p :* r r h p p :* r [rj h 7 : D r ~ u 

rh_^»/):*T rhp-^/)t>7:*D 



rht->e: A (i 



T h e : a < e' : a 1 



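Figure 7.4: Non-deterministic elaboration of expressions

This rule inserts an abstraction based on the type scheme, leaving the term t
unchanged. The variable is implicitly bound in the context, so it cannot be used
in the source language. Similarly, the rules for _ markers and conversion

\[
\frac{\lfloor\Gamma\rfloor \vdash \rho :^{\Psi} \tau}{\Gamma \vdash \_ \leadsto \rho :^{\Psi} \tau}
\qquad
\frac{\Gamma \vdash \text{ρ} \leadsto \rho :^{\Psi} \tau \qquad \lfloor\Gamma\rfloor \vdash \gamma :^{\Box} \tau \sim \upsilon}{\Gamma \vdash \text{ρ} \leadsto \rho \triangleright \gamma :^{\Psi} \upsilon}
\]

invent evidence out of thin air. This shows the non-determinism of the system.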

Applications are elaborated using the judgment for vectors of arguments,
discussed below. For example, if $f$ : Bool → Int then the source term map $f$ will be
elaborated using the telescope
$(a :^{\forall}_{i} *,\; b :^{\forall}_{i} *,\; f :^{\lambda}_{e} (a \to b),\; x :^{\lambda}_{e} [a]) \to [b]$ of map,
inserting the two implicit arguments to produce map Bool Int $f$. Note that the
vector may be empty, allowing constants (e.g. Nil) to take implicit arguments.
Applications need not be saturated, except for applications of shared functions.

Non-deterministic elaboration of vectors 

The judgment $\Gamma \vdash \text{δ} \leadsto \delta : \Delta$, defined in Figure 7.5, means that the vector δ can
be elaborated to $\delta$ in the annotated telescope $\Delta$. This inserts implicit arguments:
for example, the source vector containing the single entry Bool can be elaborated
in the telescope $a :^{\forall}_{i} *,\; b :^{\forall}_{e} a$ to the two-element vector $*, \mathsf{Bool}$ where the first
component has been inserted. Usually-implicit arguments may also have been
explicitly specified by the user: for example, the source vector {a = ℤ}, 3 can be
elaborated in the telescope $a :^{\forall}_{i} *,\; b :^{\forall}_{e} a$ to $\mathbb{Z}, 3$.
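
The insertion of implicit arguments into a vector can be sketched as a walk along the telescope. This is a simplified, hypothetical illustration (the name `elaborate_vector` and the encoding are assumptions): it ignores the dependency of later telescope entries on earlier arguments, and instead of guessing a value for an omitted implicit argument, as the non-deterministic rules do, it leaves a placeholder `?name`.

```python
def elaborate_vector(telescope, source):
    """Fill in a source argument vector against an annotated telescope.

    telescope: list of (name, visibility) pairs, visibility 'i' or 'e'.
    source: list of arguments; a (name, value) tuple stands for {name = value},
            any other value is an ordinary explicit argument.
    Omitted implicit arguments become placeholders '?name'.
    """
    rest = list(source)
    result = []
    for name, visibility in telescope:
        head = rest[0] if rest else None
        named = isinstance(head, tuple)
        if visibility == "i":
            if named and head[0] == name:
                result.append(rest.pop(0)[1])   # implicit supplied by the user
            else:
                result.append("?" + name)       # implicit left to be inferred
        else:
            if not rest:
                break                           # unsaturated application
            if named:
                raise ValueError(f"no implicit argument named {head[0]} here")
            result.append(rest.pop(0))
    if rest:
        raise ValueError("too many arguments for this telescope")
    return result
```

For the two examples in the text, `elaborate_vector([("a", "i"), ("b", "e")], ["Bool"])` yields a placeholder followed by `"Bool"`, while supplying `("a", "Z")` explicitly fills the first position with the user's choice.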

Non-deterministic elaboration of type schemes 

The judgment $\Gamma \vdash \text{σ} \leadsto \sigma$, also defined in Figure 7.5, means that the inch
type scheme σ can be elaborated to $\sigma$. This is entirely structural; the only
interesting behaviour is when elaborating types. Elaborating the codomain of a
type scheme always takes place with the domain variable bound explicitly, even if
the quantification is implicit, since the variable is still in scope for the codomain.
As an example, the type scheme for replicate

    ∀ a :: * . Π n :: ℕ → a → Vec a n

can be elaborated to the evidence type scheme

\[ (a :^{\forall}_{i} *,\; n :^{\Pi}_{e} \mathbb{N},\; x :^{\lambda}_{e} a) \to \mathsf{Vec}\ a\ n . \]

$\Gamma \vdash \text{δ} \leadsto \delta : \Delta$ (vector δ can elaborate to $\delta$ in telescope $\Delta$)

rhp-^p:^ Th6^5: [p/x] A 

rh-w- : . r h p,5 ->p,<5 : x :f a, A 

r h 6 ~* 5 : [p/a] A r h 6 ~» 5 : [p/a] A 



r h {a = p},6 ~» p,5:a:f k,A T h 5 ~» p, 5 : a :f k, A 



$\Gamma \vdash \text{σ} \leadsto \sigma$ (scheme σ can elaborate to $\sigma$)



rhr^r: v > 
T h T *^ T 



rh<-*K: v * 



r. a :^ K h (J cr 



T h K ^ K : * 
.V 
'e 



r, a ig « h cr -w cr 



rhV(a:K).cr -w (a : V «) ->■ cr 
rhT-»r: v * 



r h V(a: k) -»• a -w ( a k) ->■ cr r h IT( 



r h T ~* t : v * 
r.i r h ff cr 



r h T ~* <p : N 
r, c :° <p h cr ~^ a 



T h II(x:t). a -w (x :f r) ->■ cr T h t =>• cr ~> (c <p) ->■ cr 

Thcr'^cr' rhcr^cr 
r h cr 7 ->• cr ~» (x :£ cr') -»• cr 



Figure 7.5: Non-deterministic elaboration of vectors and type schemes

$\Gamma \vdash e : \sigma \preceq e' : \sigma'$ (scheme $\sigma$ is subsumed by $\sigma'$, converting $e$ to $e'$)



\T\ h e : A H Lrj I- 7 : D r ~ v 



r h e : a - -< e : a T \- e : t ~< e\> ^ : v 



T,y:$o> 0 hy:o> Q ^e' :a Q T,y : A ^ h e g : a x ■< e" : ^ 

r h e : (x : A <r 0 ) ^ -< A?/: A KJ ■ e" : (y : A </ 0 ) ^ 

[rj h 7 : D i; ~ r r, & : J v h e (6 > 7) : [b > 7/0] <r -< e' : <r' 



rhe:(a:Jr)^^ A&: T i;. e : (6 :J v) ->• a 




r, a :f r h e : a -< e' : cr' 



rhe:(fl:fr)4Me':a' 



r h e : a -< Aa:*r. e' : (a :f r) a' 



Figure 7.6: Non-deterministic subsumption 



7.3.2 Subsumption 

Programs involving higher-rank types may require the elaborator to do more than
insert implicit arguments in order to assign the right type. For example, if

    x :: ∀ a . Bool → a

    y :: (∀ b . (∀ c . c) → b) → Bool

then the application y x should be well-typed. The elaborator must check that
x has the scheme ∀ b . (∀ c . c) → b, which is more specific than ∀ a . Bool → a
thanks to the contravariance in the domain, as Bool is more specific than ∀ c . c.
The conversion rule for terms

\[ \frac{\Gamma \vdash \text{t} \leadsto e :^{\lambda} \sigma \qquad \Gamma \vdash e : \sigma \preceq e' : \sigma'}{\Gamma \vdash \text{t} \leadsto e' :^{\lambda} \sigma'} \]

invokes the subsumption judgment $\Gamma \vdash e : \sigma \preceq e' : \sigma'$ to verify that $\sigma'$ is more
general than $\sigma$. This judgment, defined in Figure 7.6, constructs $e'$ corresponding
to $e$ but with appropriate (implicit) abstractions and applications so that it has
type scheme $\sigma'$ rather than $\sigma$.

In the example given above, $e = x$ with scheme $\sigma = (a :^{\forall}_{i} *,\; z :^{\lambda}_{e} \mathsf{Bool}) \to a$
and $\sigma' = (b :^{\forall}_{i} *,\; z :^{\lambda}_{e} ((c :^{\forall}_{i} *) \to c)) \to b$. The variable $b$ is bound in the context,
so it can be substituted for $a$. Then both schemes are explicit λ-quantifications,
so the contravariance rule applies and checks that $(c :^{\forall}_{i} *) \to c$ is below Bool.
In turn, this instantiates $c$ with Bool and applies reflexivity. Having checked the
domains, the contravariance rule checks the codomains, which are identical. The
resulting evidence term is $y\ (\lambda b :^{\forall} * .\ \lambda z : ((c :^{\forall} *) \to c) .\ x\ b\ (z\ \mathsf{Bool}))$.

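Since subsumption involves inserting implicit λ-abstractions, it is only available
for terms (at phase λ). It is not possible at a static phase because there
is no type-level λ-abstraction. Instead, the conversion rule for types

\[ \frac{\Gamma \vdash \text{ρ} \leadsto \rho :^{\Psi} \tau \qquad \lfloor\Gamma\rfloor \vdash \gamma :^{\Box} \tau \sim \upsilon}{\Gamma \vdash \text{ρ} \leadsto \rho \triangleright \gamma :^{\Psi} \upsilon} \]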

can only appeal to a proof of type equality. This restricts the utility of higher-rank 
definitions at the type level. 

7.3.3 Soundness of non-deterministic elaboration 

Obviously, the non-deterministic system should be sound in the sense that the 
resulting evidence expression is actually well-typed. 

Theorem 7.1 (Soundness of non-deterministic elaboration). 

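(a) If $\Gamma \vdash \text{ρ} \leadsto \rho :^{\Psi} \sigma$ for $\Psi \in \{\forall, \Pi, \lambda\}$, then $\lfloor\Gamma\rfloor \vdash \rho :^{\Psi} \lfloor\sigma\rfloor$.

(b) If $\Gamma \vdash \text{σ} \leadsto \sigma$ then $\lfloor\Gamma\rfloor \vdash \lfloor\sigma\rfloor :^{\forall} *$.

(c) If $\Gamma \vdash \text{δ} \leadsto \delta : \Delta$ then $\lfloor\Gamma\rfloor \vdash \delta : \lfloor\Delta\rfloor$.

(d) If $\Gamma \vdash e : \sigma \preceq e' : \sigma'$ and $\lfloor\Gamma\rfloor \vdash e :^{\lambda} \lfloor\sigma\rfloor$ then $\lfloor\Gamma\rfloor \vdash e' :^{\lambda} \lfloor\sigma'\rfloor$.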

Proof. Straightforward structural induction on derivations. □ 

While this system provides a helpful starting point, it does not define an 
algorithm. The same syntax can be translated in many different ways depending 
on the placement of implicit applications and quantifications. For example, the 
inch term λx.λy.x y could be translated to $\lambda a :^{\forall} * .\ \lambda b :^{\forall} * .\ \lambda x : (a \to b) .\ \lambda y : a .\ x\ y$
or $\lambda b :^{\forall} * .\ \lambda x : ((a :^{\forall} *) \to a \to b) .\ \lambda y : \mathsf{Bool} .\ x\ \mathsf{Bool}\ y$ or many other
mutually-incompatible evidence terms, with no principal or canonical choice. Even if the
type scheme is known, there are many unspecified choices. 

To describe the deterministic algorithm, I must first extend the type system 
to support metavariables, which will stand for the unknown types and proof 
obligations (constraints) that arise during elaboration.

7.4 Metavariables and information increase



Just as in the unification and type inference algorithms of Part I, a metacontext
$\Theta$ contains declarations of metavariables to represent unknowns that arise during
elaboration. This includes types, represented by metavariables $\alpha$ and $\beta$, and
coercion proofs $\zeta$. Each metavariable has a telescope $\Delta$ of parameters, and a
kind $\kappa$ that may depend on $\Delta$.

Metacontexts may also bind variables. Like the annotated contexts of Sec- 
tion 7.1, these record whether the binding is implicit or explicit. Source language 
programs may refer only to explicitly bound variables. 

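The grammar of metacontexts is given by

\[
\begin{array}{rcll}
\Theta &::=& & \text{metacontext} \\
 & & \cdot & \text{empty} \\
 & & \Theta, \alpha[\Delta] :^{\Psi} \kappa & \text{unknown metavariable} \\
 & & \Theta, \alpha[\Delta] = \rho :^{\Psi} \kappa & \text{defined metavariable} \\
 & & \Theta, a :^{\Psi}_{e} \sigma & \text{explicitly-bound variable} \\
 & & \Theta, a :^{\Psi}_{i} \tau & \text{implicitly-bound variable}
\end{array}
\]

I will use $\Xi$ for a metacontext that contains only metavariables; the telescopes $\Gamma$,
$\Delta$ are metacontexts that contain only variables.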

As in previous chapters, metacontexts are ordered by dependency. Figure 7.7
gives the rules for a valid metacontext, generalising the judgment $\Gamma \vdash \mathrm{ctx}$ defined
in Figure 6.5 (page 118). This ensures that metavariables are defined uniquely
and that their types are well-kinded. The sanity condition (Lemma 6.9, page 127)
continues to hold: if $\Theta \vdash \mathrm{mctx}$ then $\Sigma \vdash \mathrm{sig}$. The typing rules in the previous
chapter are generalised by replacing $\Gamma$ with $\Theta$ and $\Gamma \vdash \mathrm{ctx}$ with $\Theta \vdash \mathrm{mctx}$.

The syntax of evidence expressions is extended with a new form $\alpha[\delta]$ for
metavariable occurrences, where $\delta$ is a vector in $\Delta$. I add a typing rule for
metavariables to the rules in Figure 6.6 (page 119):

0 3 a [A] :* k r h 5 : A 

T h a [5] :* [S/A] K 



Metasubstitutions 

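A metasubstitution $\theta : \Theta_0 \sqsubseteq \Theta_1$ gives values for metavariables in the metacontext
$\Theta_0$ in terms of the metacontext $\Theta_1$. Since metavariables have parameters, each
component of a metasubstitution takes the form $\Delta.\rho / \alpha$ where $\Delta$ is the telescope

$\Theta \vdash \mathrm{mctx}$

\[
\frac{\Sigma \vdash \mathrm{sig}}{\cdot \vdash \mathrm{mctx}}
\qquad
\frac{\alpha \mathrel{\#} \Theta \qquad \Theta, \Delta \vdash \kappa :^{\forall} *}{\Theta, \alpha[\Delta] :^{\Psi} \kappa \vdash \mathrm{mctx}}
\qquad
\frac{\alpha \mathrel{\#} \Theta \qquad \Theta, \Delta \vdash \rho :^{\Psi} \kappa}{\Theta, \alpha[\Delta] = \rho :^{\Psi} \kappa \vdash \mathrm{mctx}}
\]

\[
\frac{a \mathrel{\#} \Theta \qquad \Theta \vdash \lfloor\sigma\rfloor :^{\forall} * \qquad \Psi \neq \Box}{\Theta, a :^{\Psi}_{e} \sigma \vdash \mathrm{mctx}}
\qquad
\frac{a \mathrel{\#} \Theta \qquad \Theta \vdash \tau :^{\forall} * \qquad \Psi \neq \Box}{\Theta, a :^{\Psi}_{i} \tau \vdash \mathrm{mctx}}
\qquad
\frac{c \mathrel{\#} \Theta \qquad \Theta \vdash \varphi :^{\forall} *}{\Theta, c :^{\Box}_{i} \varphi \vdash \mathrm{mctx}}
\]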



Figure 7.7: Validity of metacontexts 



$\theta : \Theta_0 \sqsubseteq \Theta_1$

0 : ©o E ©i S^Ahp :* 0 k 
• : • E S {0,A.p/a) : 9 0 , a [A] :* /t E ©i 

0:©oE©i e!,6A\- p = 9p' :* 9 k 6 : 0 O E ©i 

(0,A.p/a) : ©o,«[A] =p' :* k E 6i 0 : 0 O , a J <r E ©i, a 0(7,2 

0 : @o E @i 
0: 6o,a:fr E6i, a :f0r,S 



Figure 7.8: Metasubstitutions

for $\alpha$, and binds variables in $\rho$. Valid metasubstitutions are defined in Figure 7.8.
The rules ensure that metasubstitutions preserve the structure of variables in the
metacontexts, as in Subsection 2.1.2 (page 14).

Metasubstitutions act on syntax by the structural closure of

\[ \theta(\alpha[\delta]) \mapsto [\delta/\Delta]\,\tau \quad \text{where } \theta \text{ contains } \Delta.\tau / \alpha. \]

The identity metasubstitution $\iota$ is defined in the usual way, replacing each
metavariable with itself. I write $\Theta_0 \sqsubseteq \Theta_1$ where the information increase is by
the identity metasubstitution.
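
The action of a metasubstitution on a metavariable occurrence can be sketched as follows. This is a hypothetical illustration over a tiny term language with only variables, applications and metavariable occurrences (so no capture-avoidance is needed); the names `subst` and `apply_meta` and the tuple encoding are assumptions made for the sketch.

```python
def subst(env, term):
    """Replace variables in term according to the dictionary env."""
    if isinstance(term, str):
        return env.get(term, term)
    if term[0] == "meta":
        return ("meta", term[1], [subst(env, arg) for arg in term[2]])
    return ("app", subst(env, term[1]), subst(env, term[2]))


def apply_meta(theta, term):
    """Apply a metasubstitution to a term.

    theta maps a metavariable name to (params, body), standing for the
    component params.body / name. An occurrence ('meta', name, args) is
    replaced by body with args substituted for params.
    """
    if isinstance(term, str):
        return term
    if term[0] == "meta":
        _, name, args = term
        args = [apply_meta(theta, arg) for arg in args]
        if name not in theta:
            return ("meta", name, args)
        params, body = theta[name]
        return subst(dict(zip(params, args)), body)
    return ("app", apply_meta(theta, term[1]), apply_meta(theta, term[2]))
```

Metavariables that `theta` does not mention are left in place, which corresponds to the identity component of the metasubstitution.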

Lemma 7.2 (Metasubstitution). If $\theta : \Theta_0 \sqsubseteq \Theta_1$ and $\Theta_0 \vdash J$, then $\Theta_1 \vdash \theta J$.
Proof. By induction on derivations. □

7.5 Deterministic elaboration 

The deterministic elaboration algorithm is built from the non-deterministic
relation by attaching input and output metavariable contexts, allowing missing
information to be replaced with metavariables. It is defined in a bidirectional
style, based on the following judgments defined in Figures 7.9 and 7.10:

• $\Theta_0 \vdash^{\Psi} \text{ρ} \leadsto_{\mathrm{sch}} \rho : \sigma \dashv \Theta_1$, meaning that ρ elaborates at phase $\Psi$ to $\rho$ with
assigned type scheme $\sigma$;

• $\Theta_0 \vdash^{\Psi} \text{ρ} \leadsto \rho : \tau \dashv \Theta_1$, meaning that ρ elaborates at phase $\Psi$ to $\rho$ with
inferred type $\tau$; and

• $\Theta_0 \vdash^{\Psi} \text{ρ} : \sigma \leadsto \rho \dashv \Theta_1$, meaning that elaborating ρ with the type scheme $\sigma$
at phase $\Psi$ produces the evidence term $\rho$.

The following auxiliary judgments are defined in Figures 7.11 to 7.13:

• $\Theta_0 \vdash \text{σ} \leadsto \sigma \dashv \Theta_1$, meaning that the type scheme σ elaborates to $\sigma$;

• $\Theta_0 \vdash^{\Psi} (\rho : \sigma)\ \text{δ} \leadsto \rho' : \tau \dashv \Theta_1$, meaning that elaborating the spine of
arguments δ applied to the elaborated head $\rho : \sigma$ results in $\rho' : \tau$;

• $\Theta_0 \vdash \text{δ} : \Delta \leadsto \delta \dashv \Theta_1$, meaning that elaborating the components of the
vector δ in the telescope $\Delta$ results in $\delta$; and

• $\Theta_0 \vdash e : \sigma \preceq \sigma' \leadsto e' \dashv \Theta_1$, meaning that the scheme $\sigma$ is subsumed by $\sigma'$
and if $e$ has scheme $\sigma$ then $e'$ is the corresponding term with scheme $\sigma'$.

Finally, the judgment $\Theta_0 \vdash \tau \sim \upsilon \leadsto \gamma \dashv \Theta_1$ means that $\tau$ and $\upsilon$ are unified,
witnessed by the coercion $\gamma$. This is an invocation of the constraint solver, which
I do not specify in detail. I discuss this further in Subsection 7.5.1.

For all these judgments, the parameters before the arrow $\leadsto$ are inputs, and
they determine the outputs (which appear after the arrow). In general, informa- 
tion flows clockwise through each inference rule, with the inputs to the conclusion 
determining the inputs to the first premise, whose outputs determine the inputs to 
the next premise, and so forth, until the outputs from all the premises determine 
the outputs of the conclusion. In this way, the rules yield an algorithm. 

The distinction between scheme assignment $\Theta_0 \vdash^{\Psi} \text{ρ} \leadsto_{\mathrm{sch}} \rho : \sigma \dashv \Theta_1$ and type
inference $\Theta_0 \vdash^{\Psi} \text{ρ} \leadsto \rho : \tau \dashv \Theta_1$ is that schemes are not inferred, only looked up
in the context or explicitly annotated by the user. A single application rule allows
an expression with a scheme to have its type inferred, by checking the vector of
arguments (which may be empty) and completing the scheme to produce a type.
Expressions with inferred types are embedded in those with assigned schemes
because the head of an application may be a λ-expression (i.e. in a β-redex) or
other expression that does not have an assigned scheme.

Example of elaboration 

Recall the example inch term λx.λy.x y. This is elaborated by generating fresh
metavariables for the domain types, so under the abstractions the context will be
$\alpha[\cdot] :^{\forall} *,\; x :^{\lambda}_{e} \alpha,\; \beta[\cdot] :^{\forall} *,\; y :^{\lambda}_{e} \beta$. The application x y is elaborated by looking
up the type $\alpha$ of $x$ in the context, and checking the vector y against it. Since
the type does not start with a quantifier, fresh metavariables $\alpha_0$ and $\alpha_1$ for the
domain and codomain are created, and the constraint $\alpha \sim (\alpha_0 \to \alpha_1)$ passed
to the constraint solver. Then y is checked at type $\alpha_0$, but looking up its type
gives $\beta$ and the subsumption judgment generates another constraint, $\beta \sim \alpha_0$.
Assuming no constraint solving, the resulting evidence term is

\[ \lambda x : \alpha .\ \lambda y : \beta .\ (x \triangleright \zeta)\ (y \triangleright \zeta') \]

in the context

\[ \alpha[\cdot] :^{\forall} *,\; \beta[\cdot] :^{\forall} *,\; \alpha_0[\cdot] :^{\forall} *,\; \alpha_1[\cdot] :^{\forall} *,\; \zeta[\cdot] :^{\Box} \alpha \sim (\alpha_0 \to \alpha_1),\; \zeta'[\cdot] :^{\Box} \beta \sim \alpha_0 . \]

In practice, the unifier will solve the constraints to give the context

\[ \alpha_0[\cdot] :^{\forall} *,\; \alpha_1[\cdot] :^{\forall} *,\; \alpha[\cdot] = \alpha_0 \to \alpha_1 :^{\forall} *,\; \beta[\cdot] = \alpha_0 :^{\forall} * \]

$\Theta_0 \vdash^{\Psi} \text{ρ} : \sigma \leadsto \rho \dashv \Theta_1$ (checking ρ at scheme $\sigma$ and phase $\Psi$ delivers $\rho$)



0 O , o :f t : a w e H 0i, a :f k,E 
9 0 h A t:(a:f K)^(i^Aa:\.eH ©i, (a :* k) / E 

©o, x : e a' h A t : a - e H ©i, x : e a - ', S 
0 O h A Ax.t : (x :f a')^a^ (Ax:"[a'\ . e) H 9i, (x : $ a') / E 

©o h A s ^ sch e:H9i ©i, x : A a h A t : a ~> e H 0 2 , x : A a, E 

©o h A (let:r = sint) : a' ^ (Ax: [^J . e') e H 0 2 ,S 

©o h A t ~^ sch e : a H ©! 
@i h e : cr -<! cr' ~> e' H 0 2 

0 O h^_:rw^H0 o ^[.] : *r 0 O h A t : a' e' H 0 2 

©o h T p -w p : r H @i ©i h r ~ f 7 H @ 2 

©o l~ T p : f ~^ p > 7 H © 2 

Figure 7.9: Type-checking elaboration 
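with reflexive proofs of the coercion metavariables, and the evidence term

\[ \lambda x : (\alpha_0 \to \alpha_1) .\ \lambda y : \alpha_0 .\ (x \triangleright \langle \alpha_0 \to \alpha_1 \rangle)\ (y \triangleright \langle \alpha_0 \rangle) . \]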
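
The example can be replayed with a much-simplified sketch of type reconstruction that creates fresh metavariables and records constraints as goals without solving them. This is a hypothetical illustration, not the algorithm of Figures 7.9 and 7.10: it has no schemes, implicit arguments or metacontext threading, metavariables are named `?0`, `?1`, ... and goals `g0`, `g1`, ... by assumption, and coercion is written `|>`.

```python
from itertools import count


def infer(term, env, supply, goals):
    """Return (evidence, type) for a term, appending equality goals to goals.

    Terms: ('var', x), ('lam', x, body), ('app', function, argument).
    Types: metavariable names or ('->', domain, codomain).
    """
    tag = term[0]
    if tag == "var":
        return term[1], env[term[1]]
    if tag == "lam":
        _, x, body = term
        dom = f"?{next(supply)}"                  # fresh metavariable for the domain
        ev, cod = infer(body, {**env, x: dom}, supply, goals)
        return f"(lam {x}:{dom}. {ev})", ("->", dom, cod)
    _, fun, arg = term
    fun_ev, fun_ty = infer(fun, env, supply, goals)
    dom, cod = f"?{next(supply)}", f"?{next(supply)}"
    g_fun = len(goals)
    goals.append((fun_ty, ("->", dom, cod)))      # head type ~ (dom -> cod)
    arg_ev, arg_ty = infer(arg, env, supply, goals)
    g_arg = len(goals)
    goals.append((arg_ty, dom))                   # argument type ~ dom
    return f"({fun_ev} |> g{g_fun}) ({arg_ev} |> g{g_arg})", cod


term = ("lam", "x", ("lam", "y", ("app", ("var", "x"), ("var", "y"))))
goals = []
evidence, ty = infer(term, {}, count(), goals)
```

Here `?0` and `?1` play the roles of $\alpha$ and $\beta$, `?2` and `?3` those of $\alpha_0$ and $\alpha_1$, and the two recorded goals correspond to the constraints $\alpha \sim (\alpha_0 \to \alpha_1)$ and $\beta \sim \alpha_0$.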
Parameterisation 

The operation $\Delta / \Xi$ parameterises the metavariables $\Xi$ over a telescope $\Delta$:

\[
\begin{array}{rcl}
\Delta / \cdot &\mapsto& \cdot \\
\Delta / (\alpha[\Gamma] :^{\Psi} \kappa, \Xi) &\mapsto& \alpha[\Delta, \Gamma] :^{\Psi} \kappa,\; \Delta / \Xi
\end{array}
\]

This allows a telescope of variables to be taken out of scope during elaboration,
such that any metavariables introduced retain the appropriate parameters: if
$\Theta, \Delta, \Xi \vdash \mathrm{mctx}$ then $\Theta, \Delta / \Xi \vdash \mathrm{mctx}$. It permutes existential quantifiers
from right to left past universal quantifiers, the 'raising' of Miller (1992).
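
The parameterisation operation amounts to prefixing each metavariable's parameter telescope. A minimal sketch, assuming metavariable declarations are represented as hypothetical `(name, params, kind)` triples and leaving the occurrences of the metavariables untouched:

```python
def parameterise(delta, metavars):
    """Prefix every metavariable's parameter telescope with delta.

    delta: list of (variable, type) bindings being taken out of scope.
    metavars: list of (name, params, kind) declarations.
    """
    return [(name, list(delta) + list(params), kind)
            for name, params, kind in metavars]
```

The sketch only rewrites declarations; a full implementation would also need to rewrite each occurrence of a raised metavariable so that it is applied to the variables of `delta`.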

This definition and its uses involve a slight abuse of notation, as formally
all occurrences of metavariables from $\Xi$ should be replaced with occurrences in
which the parameters are prefixed by the identity substitution: for example,
$a :^{\forall}_{e} \kappa,\; \beta[\cdot] :^{\forall} \kappa \vdash \beta[\cdot] :^{\forall} \kappa$ but $\beta[a :^{\forall} \kappa] :^{\forall} \kappa,\; a :^{\forall}_{e} \kappa \vdash \beta[a] :^{\forall} \kappa$. In practice, I will
elide the necessary weakenings.

$\Theta_0 \vdash^{\Psi} \text{ρ} \leadsto_{\mathrm{sch}} \rho : \sigma \dashv \Theta_1$ (ρ elaborates to $\rho$ with assigned scheme $\sigma$)



0 O 3 x :f a $ * 

©o x ^ sch a: : o- H ©i 

©0 h CT (7 H @i 

@i h* p : a ~> p H @ 2 
©o (p:a) ^ sch p:trHe 2 



@o h* H -> sch H:H6i 

@o p ~* p : r H @i 
©o h* p ^ sch p : r H ©! 



©o h* p p : r H @i 



©o p ^ sch p : a H ©x 

©x (p : tr) 6 -w fj : r H © 2 

©o p 5 ^ p' : r H 0 2 
@o h v k : * -w /c H ©i 



(p elaborates to p with inferred type r) 

E 9 / [A] :* k $ ^ 

©o h 6 : A/* ^ 5 H 0 X 
e 0 7(6) - : [5/ A] H9 2 

©i, a k h v T : * r H 0 2 , a ^ k,E 



©o K V(o: k) ->■ T -w ( a: V) ->r:H0 2 ,(a: k)/H 



©o h v T : * ~» r H ©J 



©!, x :° r h v v : * <w « H © 2 , x £ r, S 



@o h v II(x:t) ^ u (x: n r) -^u:H0 2 ,(i: n r)/H 

©o h v T : * <w r H @i 0! h v v : * ~» v H @ 2 

9 o l- V T->u^T->t):H02 

©o, a [• ] : v *, x : A a h A t ~» e : r H @ 1; x : A a, S 
@o h A Ax . t -w (Ax : a . e) : (a ->■ r) H @i, H 

©o h A s ^ sch e:H6i ©i, x : A a h A t e' : r H 0 2 , x : A a, S 

©o h A (letx = sint) -w (Ax: [ffj . e') e : r H @ 2 ,S 



©o I"* _ ~* P ■ a H ©o, a [• ] : v *, /3 [• ] :* a 



Figure 7.10: Type-reconstructing elaboration

$\Theta_0 \vdash \text{σ} \leadsto \sigma \dashv \Theta_1$ (scheme σ elaborates to $\sigma$)



0 O h v T : * r H 0! 



0 O h T -w r H ©! 

©o h v k : * w k H 0! ei,o:^hff-»ffHei,fl:*K,S 
0o h V(a: k). a^(fl:^)^H 9i, (a : V k) / S 

0 O h v k : * k H ©i @i, a^/cr-ff-wcrH @i, a k, E 



9 0 hV(o:K)^(T^(a:^)^H © 1; (a : v re) / E 

©o h v T : * ~* t H @! ©!, x r h a cr H ©i, x r, : 

©o h n(x:x) -fff^(i:°r)-fH6i, (x : n r) / S 

©o h v T : * ~> t H 0! ©!, x £ r h a ~* cr H ©i, x r, ! 

©o h n(x:T). :f rj-xrH ©i, (x : n r) / E 



©o h v T : * ~» y? H ©! 



.v 



9 0 h T a — (c :° <p) a H © 1; (c : D y?) / S 
©o h a' ^ cr' H @i e^a^HBi 

©0 h & ->• CT ^ (x J a) a H ©x 

Figure 7.11: Elaboration of type schemes

$\Theta_0 \vdash^{\Psi} (\rho : \sigma)\ \text{δ} \leadsto \rho' : \tau \dashv \Theta_1$ (applying $\rho : \sigma$ to δ results in $\rho' : \tau$)

0 O h*'V:<7'-p'Hei 

0! h* (pp' : [p'/a] a) 6 -w p" : r H 0 2 

0 (p : a) • p : [o"J H 0 0 O (p : (a :f a') -> a) (p 7 , 5)^p":rH 0 2 

0 0 h*'* p' :k~> P ' H0i 

01 (pp' : [p'/a] a)b^ p" : r H 0 2 



0o h* (p : (a :f k) a) ({a = p'}, 5) ^ p" : r H 0 2 

©o, a [• ] : $ sh* (pa: [a/a] cr) 5 -w p' : r H 0i 
©o h* (p :(fl:f/t)^a)6^p':rH 0 X 

©o, a [• ] : v *, /3 [• ] : v * h u ~ (a -> /3) -w 7 H ©i 
@i (p > 7 : a ->■ (3) b -w p' : r H 0 2 

9 0 h* (p:w)5^/)':rHe 2 



©o h 5 : A 5 H ©! 



9h-: - -w • H © 

©o h* p : k ~* p H ©i 

@i h 6 : [p/a] A^H9 2 



(vector b in telescope A elaborates to 5) 

©o p : a -w p H ©i 

©i h 5 : [p/a] A^H0 2 

©o h p, 6 : a :* a, A -w p, <5 H @ 2 

8 0) a[-]:*Khfi: [a/a] A ~» 5 H © x 



0 o h{a = p},5: a:? k, A ~» p, 5 H 0 2 



©0 h 6 : a :f k, A ~> a,H © x 



Figure 7.12: Elaboration of spines and vectors

$\Theta_0 \vdash e : \sigma \preceq \sigma' \leadsto e' \dashv \Theta_1$



0 h e : A |aj 



(cr is subsumed by a' , converting e to e' ) 
0ol~T~t;^7H6i 



9he:(7^(7^eH0i 9 0 h e : r ^ u el>7 H 9i 

6 0 , y :^ a' Q h y : (Tq -< tr 0 ~* e' H ©i 

81 h e e' : eg < a\ ~> e" j © 2 , y : A ^ 5 

0 O h e : (x : A ^) -> a x < (y : A <x 0 ) ^ a[ ^ \y: [a' 0 \ . e" H 0 2 ,H 

©ol _ '^~r^7H0i 

0i, & :J f h e (& > 7) : [b> 7/a] cr -< a' e' H 0 2 , & :J i>, S 
©o P e : (a :J r) -> a -< (& :J 4ff'^Ai: T u.e'He 2l (i: T !)) / 

©o, a :f k h e : cr -< a' ~* e' H ©i, a :f k, S 
©o P e : a -< (a :f k) -> </ ^ Ao:*K.e' H ©i, (a : $ /t)/» 

©o, a [• ] :* k h e a : [a/a] a" -< a 1 ~* e' H @i 
©o P e : (a :f k) a -< a ^ e H ©i 



Figure 7.13: Subsumption 

7.5.1 Unification 

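The unification judgment $\Theta_0 \vdash \tau \sim \upsilon \leadsto \gamma \dashv \Theta_1$ means that unifying $\tau$ with $\upsilon$
in metacontext $\Theta_0$ produces the proof $\gamma$ in metacontext $\Theta_1$. Conceptually, it is
defined using the single rule

\[ \frac{\Theta_0, \zeta[\cdot] :^{\Box} \tau \sim \upsilon \twoheadrightarrow^{*} \Theta_1}{\Theta_0 \vdash \tau \sim \upsilon \leadsto \zeta \dashv \Theta_1} \]

where a new proof obligation $\zeta$ (a metavariable at phase □, also known as a goal)
is added to the metacontext and a backward chaining proof search procedure is
invoked to take as many steps $\Theta \twoheadrightarrow \Theta'$ as possible, solving or simplifying goals.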

I will not define the proof search algorithm (the -» relation) fully, as my focus 
is on elaboration rather than constraint-solving, but a few comments on the steps 
it would take are in order. 

The basic inference rules for backward chaining are the coercion constructors. 
For example, if the metacontext includes a goal of type $\tau_1\,\upsilon_1 \sim \tau_2\,\upsilon_2$, then
backward chaining on congruence of application would turn this into subgoals
with types $\tau_1 \sim \tau_2$ and $\upsilon_1 \sim \upsilon_2$. The coercion constructor allows a witness to
the original goal to be built from the subgoal metavariables. In this example, the
metacontext

\[ \Theta,\; \zeta[\Delta] :^{\Box} \tau_1\,\upsilon_1 \sim \tau_2\,\upsilon_2 \]

can be replaced with

\[ \Theta,\; \zeta_0[\Delta] :^{\Box} \tau_1 \sim \tau_2,\; \zeta_1[\Delta] :^{\Box} \upsilon_1 \sim \upsilon_2,\; \zeta[\Delta] :=^{\Box} \mathsf{cong}\ \zeta_0\ \zeta_1 . \]

Similarly, other congruence rules can be used to decompose rigid-rigid constraints,
the step constructor can be used to reduce (compute) expressions and coherence
can be used to remove coercions from equational goals.
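
One backward-chaining step of this kind, decomposing a goal between two applications, can be sketched as below. This is a hypothetical illustration of the idea only: the dictionary encoding, the function name `decompose`, the subgoal naming scheme and the tag `'cong'` are assumptions, and the sketch ignores parameter telescopes and the dependency ordering of the metacontext.

```python
def decompose(goals, name):
    """Split an unsolved application goal into two subgoals.

    goals maps a goal name to {'type': (lhs, rhs), 'proof': proof or None}.
    Types are atoms (strings) or applications ('app', function, argument).
    Returns a new dictionary; the input is left unchanged.
    """
    lhs, rhs = goals[name]["type"]
    both_apps = all(isinstance(t, tuple) and t[0] == "app" for t in (lhs, rhs))
    if goals[name]["proof"] is not None or not both_apps:
        return dict(goals)                        # nothing to do for this goal
    fun_goal, arg_goal = name + "0", name + "1"   # assumed fresh subgoal names
    new_goals = dict(goals)
    new_goals[fun_goal] = {"type": (lhs[1], rhs[1]), "proof": None}
    new_goals[arg_goal] = {"type": (lhs[2], rhs[2]), "proof": None}
    # The original goal is now defined in terms of its subgoals.
    new_goals[name] = {"type": (lhs, rhs), "proof": ("cong", fun_goal, arg_goal)}
    return new_goals
```

Because the original goal becomes defined rather than being deleted, any evidence term that already mentions it remains well-formed, which is what makes each step an information increase.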

Flex-flex or flex-rigid constraints (between two metavariables or a metavari- 
able and a rigid term) can be solved by inversion and intersection, along the lines 
of the higher-order unification algorithm discussed in Section 4.2 (page 67). 

The local parameter telescope of a goal contains the hypotheses available for 
proving that goal, which may allow it to be solved or simplified via backward 
chaining. For example, the goal $\zeta[c :^{\Box} b \sim a] :^{\Box} a \sim b$ can be solved by
$\zeta[c :^{\Box} b \sim a] :=^{\Box} \mathsf{sym}\ c$. Introducing a hypothesis uses λ-abstraction for coercions.

Since the integers form an abelian group, constraint solving for linear integer 
constraints can follow the approach taken in Chapter 3. 

Assuming that the proof search algorithm is sound (i.e. all steps are identity 
metasubstitutions), its embedding into elaboration is sound: 

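Lemma 7.3 (Soundness of unification). Suppose that for all $\Theta$ and $\Theta'$, $\Theta \twoheadrightarrow \Theta'$
implies $\Theta \sqsubseteq \Theta'$. If $\Theta_0 \vdash \tau \sim \upsilon \leadsto \gamma \dashv \Theta_1$ then $\Theta_0 \sqsubseteq \Theta_1$ and $\Theta_1 \vdash \gamma :^{\Box} \tau \sim \upsilon$.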

Proof. By transitivity of metasubstitution and the typing rule for metavariables. 

□ 

7.5.2 Soundness of elaboration 

The elaboration algorithm is related back to the non-deterministic specification 
by the following theorem, which states that the algorithm produces one possible 
elaboration of the input term. 

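Theorem 7.4 (Soundness of elaboration). Suppose $\Psi \in \{\forall, \Pi, \lambda\}$.

(a) If $\Theta_0 \vdash^{\Psi} \text{ρ} \leadsto_{\mathrm{sch}} \rho : \sigma \dashv \Theta_1$ then $\Theta_0 \sqsubseteq \Theta_1$ and $\Theta_1 \vdash \text{ρ} \leadsto \rho :^{\Psi} \sigma$.

(b) If $\Theta_0 \vdash^{\Psi} \text{ρ} \leadsto \rho : \tau \dashv \Theta_1$ then $\Theta_0 \sqsubseteq \Theta_1$ and $\Theta_1 \vdash \text{ρ} \leadsto \rho :^{\Psi} \tau$.

(c) If $\Theta_0 \vdash^{\Psi} \text{ρ} : \sigma \leadsto \rho \dashv \Theta_1$ then $\Theta_0 \sqsubseteq \Theta_1$ and $\Theta_1 \vdash \text{ρ} \leadsto \rho :^{\Psi} \sigma$.

(d) If $\Theta_0 \vdash \text{σ} \leadsto \sigma \dashv \Theta_1$ then $\Theta_0 \sqsubseteq \Theta_1$ and $\Theta_1 \vdash \text{σ} \leadsto \sigma$.

(e) If $\Theta_0 \vdash \text{ρ} \leadsto \rho :^{\Psi} \sigma$ and $\Theta_0 \vdash^{\Psi} (\rho : \sigma)\ \text{δ} \leadsto \rho' : \tau \dashv \Theta_1$ then $\Theta_0 \sqsubseteq \Theta_1$ and
$\Theta_1 \vdash \text{ρ δ} \leadsto \rho' :^{\Psi} \tau$.

(f) If $\Theta_0 \vdash \text{δ} : \Delta \leadsto \delta \dashv \Theta_1$ then $\Theta_0 \sqsubseteq \Theta_1$ and $\Theta_1 \vdash \text{δ} \leadsto \delta : \Delta$.

(g) If $\Theta_0 \vdash e : \sigma \preceq \sigma' \leadsto e' \dashv \Theta_1$ then $\Theta_0 \sqsubseteq \Theta_1$ and $\Theta_1 \vdash e : \sigma \preceq e' : \sigma'$.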

Proof. By induction on derivations, using Lemma 7.3 for unification. □ 

7.6 Elaboration for case analysis 

The system I have presented so far lacks case analysis, which is rather important in 
practice. Therefore, I will now present the elaboration rules for case expressions, 
extending the previous non-deterministic and deterministic systems. 

I will consider only flat (non-nested) pattern matches; nested pattern match- 
ing is a well-studied topic (Augustsson, 1985) that can be presented via elabora- 
tion, but would complicate the presentation further. 

Moreover, I will continue to assume that case expressions are covering. It 
is easy to amend the elaboration rules for case expressions to insert missing 
branches that generate an appropriate runtime error. True coverage checking 
is less straightforward, because for each omitted data constructor the constraint 
solver must establish that the constraints it introduces are unsolvable. Goguen 
et al. (2006) suggest extending the language of patterns with 'refutations', which 
allow the programmer to indicate arguments that are uninhabited. 

The grammar of expressions ρ (and correspondingly terms t and types τ) is
extended by case and dcase expressions:

ρ  ::= (d)case ρ of br* | ···
br ::= K vs → ρ
vs ::= · | x, vs | {a = b}, vs

Each branch br in the inch source language matches a single data constructor K 
and binds variables vs, some of which may be implicit. The syntax {a = b} means 
that the implicitly bound variable a in the telescope of the data constructor should 
be brought into scope with name b, that is, the right-hand side of the equation 
is the binding occurrence.

$\Gamma \vdash \text{ρ} \leadsto \rho :^{\Psi} \sigma$ (ρ can elaborate to $\rho$ with scheme $\sigma$ at phase $\Psi$)



r h p-w p : * T h r : v * 

r hbr 0 -w 6r 0 :* Dt^' ► r ... r h br„ -w fcr„ :* D^ 1 ' ► r 
r h case pof br 0 ... br„ ~* case p of 6r 0 ... br n :* r 

rhp^e: n //*Dr rhr: v * 

r h br 0 6r 0 :* (e : Du^) ► r ... r h br n -w &r„ :* (e : Dv i i ) ► r 
r h dcase p of br 0 ... br„ ~* dcaseerof 6r 0 ... 6r„ :* r 



(br can elaborate to br at phase 



£ 9 K : (o,- K f , A) ->■ Do^ 

vs : [ i;,-/ Oj ] A -» A' 
r,A'hp^p:*r 



r h (Kvs p) -w (K [A'J p) :* Dui* ► r 



Thhr^br :* (e : D^) ►r 



£ 9 K : (a,- «< , A) ->• Da 



(for can elaborate to br at phase 

1 $^n//^ 



vs : [^/oi ]A//n -» A' 
A" = A',c:°e~(Kv- i i A') 
r, A" h p :* r 



T h (Kvs -)• p) -w (K LA"J -> p) :* (e : DtT) ► r 



vs : A -» A' 



vs : [x/y] A ^> A' 
x, vs : y :* <7, A -» x :* <7, A' 

vs : [ft/a] A ^> A' $ 7^ □ 

vs : o :f k, A -» a :f k, A' {a = &},vs : a :f k, A -» b :f k, A' 



vs : A ^> A' 



Figure 7.14: Non-deterministic elaboration of case expressions 

7.6.1 Extending the non-deterministic system

Figure 7.14 gives the new non-deterministic elaboration rules (extending those
in Figure 7.4) for non-dependent and dependent case expressions. In each rule,
the scrutinee is elaborated to give an expression of type $D\,\overline{\upsilon}$, then each of the
branches is elaborated and must deliver a common type $\tau$, the type of the whole
expression, which must not depend on variables in any of the branches. For the
dependent case, the scrutinee is elaborated at phase $\Pi /\!/ \phi$ (rather than $\phi$) to
ensure that it can appear in types and at runtime (if necessary).

The judgment $\Gamma \vdash br \leadsto br :^{\phi} D\,\overline{\upsilon} \blacktriangleright \tau$ means that the case branch $br$ can
be elaborated to $br$, where the scrutinee has type $D\,\overline{\upsilon}$, and the result has type
$\tau$. Branches must be of the form $K\ vs \to \rho$ where $K$ is a constructor of $D$ and
is accessible at the current phase. The implicit and explicit variable bindings $vs$
are elaborated in the constructor's telescope $\Delta$ to produce another telescope $\Delta'$
that is in scope when the result of the branch is elaborated. For GADT matches,
this telescope will include the equational constraints encoded by the GADT.

Similarly, the judgment $\Gamma \vdash br \leadsto br :^{\phi} (e : D\,\overline{\upsilon}) \blacktriangleright \tau$ means that the
dependent case branch $br$ can be elaborated to $br$, under the assumption that the
scrutinee is equal to $e$.

The judgment $vs : \Delta \twoheadrightarrow \Delta'$ means that matching the source language
bindings $vs$ against the annotated telescope $\Delta$ of a data constructor results in the
annotated telescope $\Delta'$. This gives the telescope needed to elaborate the result
of a case branch, using the binding names from $vs$ but obtaining their types from
$\Delta$. Implicit bindings are introduced silently, or the user can explicitly bind a
name that would usually be implicitly bound. This resembles the judgment for
elaborating vectors in Figure 7.5, but for patterns rather than general expressions.

As an example, consider the following definition of append via case analysis:

append zs ys = case zs of
    Nil → ys
    Cons {m = m'} x xs → Cons x (append xs ys)

To check the Cons branch, recall that the type scheme for Cons (after the GADT
translation) is $(\Delta) \to \mathsf{Vec}\ a\ b$ where

$$\Delta = a :^{\forall} \star,\ b :^{\forall} \mathbb{N},\ m :^{\forall} \mathbb{N},\ x : a,\ xs : \mathsf{Vec}\ a\ m,\ c :^{\Box} (b \sim \mathsf{Suc}\ m).$$

The bindings {m = m'}, x, xs are successfully matched against the telescope $\Delta$,
renaming m to m' and introducing a, b and c implicitly. The branch result
Cons x (append xs ys) is then elaborated under the renamed telescope of bindings.
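
The matching of bindings against a telescope can be sketched in Haskell as follows. This is a deterministic simplification made for illustration, not the non-deterministic judgment itself: it assumes that an implicit entry is renamed only when the next source binding names it with {a = b}, and it models types as lists of tokens so that renaming is a simple map. The names `Entry`, `Binding`, `rename` and `match` are hypothetical.

```haskell
type Name = String

-- A telescope entry: its name, whether it is bound implicitly, and its type
-- (a list of tokens, which is enough to illustrate renaming).
data Entry = Entry
  { entryName     :: Name
  , entryImplicit :: Bool
  , entryType     :: [Name]
  } deriving (Eq, Show)

-- Source bindings: an explicit variable x, or {a = b}.
data Binding = BindVar Name | BindImplicit Name Name
  deriving (Eq, Show)

-- Rename a variable in the types of the remaining telescope entries.
rename :: Name -> Name -> [Entry] -> [Entry]
rename old new = map (\e -> e { entryType = map sub (entryType e) })
  where sub t = if t == old then new else t

-- Match bindings against a constructor telescope, producing the telescope
-- that is in scope in the branch body, or Nothing if they do not fit.
match :: [Binding] -> [Entry] -> Maybe [Entry]
match [] [] = Just []
match vs (e : tel)
  -- {a = b} brings the implicit entry a into scope under the name b.
  | entryImplicit e, (BindImplicit a b : vs') <- vs, a == entryName e
      = (e { entryName = b } :) <$> match vs' (rename a b tel)
  -- Otherwise an implicit entry is introduced silently, keeping its name.
  | entryImplicit e
      = (e :) <$> match vs tel
  -- An explicit entry must be matched by a source variable.
  | (BindVar x : vs') <- vs
      = (e { entryName = x } :) <$> match vs' (rename (entryName e) x tel)
match _ _ = Nothing

-- The telescope of Cons after the GADT translation (phases omitted).
consTelescope :: [Entry]
consTelescope =
  [ Entry "a"  True  ["*"]
  , Entry "b"  True  ["Nat"]
  , Entry "m"  True  ["Nat"]
  , Entry "x"  False ["a"]
  , Entry "xs" False ["Vec", "a", "m"]
  , Entry "c"  True  ["b", "~", "Suc", "m"]
  ]

-- The bindings of the Cons branch: {m = m'} x xs.
consBindings :: [Binding]
consBindings = [BindImplicit "m" "m'", BindVar "x", BindVar "xs"]
```

Evaluating `match consBindings consTelescope` renames `m` to `m'` throughout the rest of the telescope, so the proof entry `c` ends up with type `b ~ Suc m'`, while `a`, `b` and `c` are introduced without being mentioned in the source.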

Soundness of non-deterministic elaboration (Theorem 7.1) must be extended
with the following additional cases:

Lemma 7.5 (Soundness of non-deterministic elaboration for case analysis).

(a) If $\Gamma \vdash br \leadsto br :^{\phi} D\,\overline{\upsilon} \blacktriangleright \tau$ then $\Gamma \vdash br :^{\phi} D\,\overline{\upsilon} \blacktriangleright \tau$.

(b) If $\Gamma \vdash br \leadsto br :^{\phi} (e : D\,\overline{\upsilon}) \blacktriangleright \tau$ then $\Gamma \vdash br :^{\phi} (e : D\,\overline{\upsilon}) \blacktriangleright \tau$.

Proof. By structural induction, mutually with Theorem 7.1. □

7.6.2 Extending the deterministic system 

Extension of the deterministic elaboration system is mostly routine, following the
non-deterministic system. Figure 7.15 gives the additional elaboration rules for
case expressions (extending the rules in Figures 7.10 and 7.9). As in the
non-deterministic system, auxiliary judgments
$\Theta_0 \vdash^{\phi} br : \upsilon \blacktriangleright \tau \leadsto br \dashv \Theta_1$ and
$\Theta_0 \vdash^{\phi} br : (e : \upsilon) \blacktriangleright \tau \leadsto br \dashv \Theta_1$
describe elaboration for individual branches.
I write a list of semicolon-separated elaboration judgments to mean that the
metacontexts are threaded through from one to another.

In the deterministic system, case expressions are checked, rather than having
a type inferred. Inference is dealt with by generating a fresh metavariable $\beta$ and
checking that the expression has type $\beta$. It is not immediately apparent how
the datatype $D$ is to be determined: it might be obvious from the type $\upsilon$ of the
scrutinee, but it might not (if a constraint must be solved to show that $\upsilon$ is an
algebraic datatype). Alternatively, the types of the data constructors from the
branches can be consulted, provided the case expression is non-empty.

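Soundness of elaboration (Theorem 7.4) is extended with the following:

Lemma 7.6 (Soundness of elaboration for case expressions).

(a) If $\Theta_0 \vdash^{\phi} br : D\,\overline{\upsilon} \blacktriangleright \tau \leadsto br \dashv \Theta_1$ then $\Theta_1 \vdash br \leadsto br :^{\phi} D\,\overline{\upsilon} \blacktriangleright \tau$ and
$\Theta_0 \sqsubseteq \Theta_1$.

(b) If $\Theta_0 \vdash^{\phi} br : (e : D\,\overline{\upsilon}) \blacktriangleright \tau \leadsto br \dashv \Theta_1$ then $\Theta_1 \vdash br \leadsto br :^{\phi} (e : D\,\overline{\upsilon}) \blacktriangleright \tau$.

Proof. By structural induction on derivations, mutually with Theorem 7.4. □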

[Figure 7.15 gives inference rules for case and dcase in the judgments $\Theta_0 \vdash^{\phi} \rho \leadsto \rho : \tau \dashv \Theta_1$ ($\rho$ elaborates to $\rho$ with inferred type $\tau$) and $\Theta_0 \vdash^{\phi} \rho : \sigma \leadsto \rho \dashv \Theta_1$ (checking $\rho$ at scheme $\sigma$ and phase $\phi$ delivers $\rho$), and defines the judgments $\Theta_0 \vdash^{\phi} br : \upsilon \blacktriangleright \tau \leadsto br \dashv \Theta_1$ (case branch $br$ elaborates to $br$) and $\Theta_0 \vdash^{\phi} br : (e : \upsilon) \blacktriangleright \tau \leadsto br \dashv \Theta_1$ (dependent case branch $br$ elaborates to $br$).]

Figure 7.15: Elaboration of case expressions 

7.6.3 Example of elaborating a function definition

Recall the replicate example from previous chapters:

replicate :: ∀a :: ⋆ . Π n :: ℕ → a → Vec a n
replicate n x = dcase n of
    Zero → Nil
    Suc m → Cons x (replicate m x)

How will this be elaborated, as a shared function? Elaborating the type scheme
produces $(\Delta) \to \mathsf{Vec}\ a\ n$ where $\Delta = a :^{\forall} \star,\ n :^{\Pi} \mathbb{N},\ x : a$, so the body should be
elaborated at phase $\Pi$ in the context $\mathsf{replicate}\,[\Delta] :^{\Pi} \mathsf{Vec}\ a\ n,\ \Delta$. Thus recursive
calls to replicate can be made at phase $\Pi$, and its arguments are in scope.

To elaborate the body, the dcase expression must be checked at type Vec a n.
The scrutinee n is inferred to have type ℕ. It must then be checked that each of
the branches accepts this scrutinee and produces a result of type Vec a n.

In the Zero branch, the constructor telescope is empty so the only variable
brought into scope is an implicit proof $c :^{\Box} n \sim \mathsf{Zero}$. The result Nil must then
be elaborated at type Vec a n under this hypothesis. Since its type scheme is

$$(a :^{\forall} \star,\ n :^{\forall} \mathbb{N},\ c :^{\Box} n \sim \mathsf{Zero}) \to \mathsf{Vec}\ a\ n$$

the rule for elaborating a term applied to a spine of arguments (empty, in this
case) generates metavariables $\alpha$, $\beta$, $\zeta$ for the implicit arguments, so Nil elaborates
to $\mathsf{Nil}\ \alpha\ \beta\ \zeta$ of inferred type $\mathsf{Vec}\ \alpha\ \beta$ with the proof obligation $\zeta :^{\Box} \beta \sim \mathsf{Zero}$ in
the context. Subsumption allows this term to be checked at type Vec a n, adding
another proof obligation $\zeta' :^{\Box} \mathsf{Vec}\ \alpha\ \beta \sim \mathsf{Vec}\ a\ n$ and resulting in the term
$\mathsf{Nil}\ \alpha\ \beta\ \zeta \triangleright \zeta'$. It should not be difficult for the unifier to solve $\alpha := a$ and $\beta := n$,
so $\zeta'$ is reflexive. Then $\zeta :^{\Box} n \sim \mathsf{Zero}$ can be solved by $c$. The final branch is:

$$\mathsf{Zero}\ (c :^{\Box} n \sim \mathsf{Zero}) \to \mathsf{Nil}\ a\ n\ c \triangleright \langle \mathsf{Vec}\ a\ n \rangle$$

The coercion by reflexivity can be removed. Other solutions to the constraints are
possible, but they affect only the coercions, which are operationally irrelevant.

In the Suc branch, the constructor has telescope $y : \mathbb{N}$, and matching the
source-level bindings against it gives $m : (y : \mathbb{N}) /\!/ \Pi \twoheadrightarrow m :^{\Pi} \mathbb{N}$ so the match
brings into scope $m$ and a proof $c :^{\Box} n \sim \mathsf{Suc}\ m$. Insertion of implicit arguments
proceeds similarly to the Nil case. The scheme $\Delta$ for the recursive call to replicate
is supplied by the context, and is used to check its vector of arguments a, m, x.
The final result of elaborating the branch (omitting coercions) is:

$$\mathsf{Suc}\ (m :^{\Pi} \mathbb{N},\ c :^{\Box} n \sim \mathsf{Suc}\ m) \to \mathsf{Cons}\ a\ n\ m\ x\ (\mathsf{replicate}\ (a, m, x))\ c$$
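
For comparison, a similar function can be written in GHC Haskell today by approximating the Π-quantified argument with a singleton type. The sketch below is an analogy rather than the output of the elaborator: `SNat`, `replicateV` and `toList` are hypothetical names, and the equations that appear as explicit proofs $c$ in the elaborated branches are here introduced implicitly by GADT pattern matching.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}
import Data.Kind (Type)

data Nat = Zero | Suc Nat

data Vec (a :: Type) (n :: Nat) where
  Nil  :: Vec a 'Zero
  Cons :: a -> Vec a n -> Vec a ('Suc n)

-- A singleton for Nat: a runtime value whose type determines the index n,
-- standing in for an argument that is shared between terms and types.
data SNat (n :: Nat) where
  SZero :: SNat 'Zero
  SSuc  :: SNat m -> SNat ('Suc m)

-- Matching on SZero brings n ~ 'Zero into scope, so Nil has the required
-- type; matching on SSuc m brings n ~ 'Suc m into scope for the Cons case.
replicateV :: SNat n -> a -> Vec a n
replicateV SZero    _ = Nil
replicateV (SSuc m) x = Cons x (replicateV m x)

-- Forget the length index, for inspecting results.
toList :: Vec a n -> [a]
toList Nil         = []
toList (Cons x xs) = x : toList xs
```

For instance, `toList (replicateV (SSuc (SSuc SZero)) 'x')` evaluates to `"xx"`.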

7.7 Discussion

In this chapter, I have presented an algorithm for elaborating the inch source
language of Chapter 5 to the evidence language of Chapter 6. While it does not
cover every feature available in Haskell, it does demonstrate the way in which an
elaborator can be built up to cover a large source language, retaining confidence
in the system through translation of source programs into an intermediate
representation. The elaborator supports dependent Π-types with type-refining case
analysis, higher-rank types and GADTs, though the exact capabilities will depend
on the underlying unification algorithm. I have also presented an approach to
implicit argument synthesis that generalises the current Haskell policy of 'invisible
types, visible terms' to allow for explicit type application and implicit Π-types.

7.7.1 Generalisation 

Chapter 2 demonstrated that generalisation of polymorphic let-definitions can be 
performed through 'skimming off' metavariables from the context after inferring 
the type of the definiens. Chapter 3 extended this to deal with abelian group 
unification. However, in the more complicated situation of inch elaboration, 
generalisation becomes yet more problematic. The presence of local constraints 
and parameterised metavariables means there is no reasonable way to decide 
which metavariables to generalise: attempting to generalise over parameterised 
metavariables leads to non-principal solutions. 

For example, suppose the expression being generalised has type $\alpha \to \alpha$, and
the context suffix is $\alpha[\cdot] :^{\forall} \star,\ \zeta[c :^{\Box} \beta \sim \mathsf{Bool}] :^{\Box} \alpha \sim \mathsf{Bool}$. In this case, the
type of $\zeta$ does not depend on its parameters, so we could discard the hypothesis
$c$ and generalise to produce a result of type $(a :^{\forall} \star) \to (z :^{\Box} a \sim \mathsf{Bool}) \to a \to a$,
i.e. $\mathsf{Bool} \to \mathsf{Bool}$ up to isomorphism. However, if we refrain from generalising and
later discover that $\alpha \sim \beta$ then the result has type $\alpha \to \alpha$ for $\alpha$ an unconstrained
metavariable. The order of constraint solving is now crucial: different solutions
may be found as a result of slight variations in the program, and in general
elaboration becomes fragile.

What hope, then, for generalisation? In inch, I follow the advice of Vytiniotis
et al. (2010) that local 'let should not be generalised'.¹ This strategy has the
advantage of simplicity, but other choices (some with a more heuristic character) are
available. One might choose to generalise whenever a let-expression did not give
rise to parameterised metavariables at all, perhaps because no local constraints
were introduced by case analysis of GADTs or subexpressions with higher-rank
types, or because all the constraints introduced were solved by unification on the
fly. This has the advantage of allowing generalisation in common cases, but it
may be difficult for programmers to predict whether generalisation will take place
without knowing the details of the inference algorithm.

¹ Top level let-bindings can be generalised, because parameterised metavariables can either
be solved or reported as errors.

7.7.2 Related and future work 

The non-deterministic elaboration system is reminiscent of the approach taken in 
the Definition of Standard ML (Milner et al., 1997), which specifies elaboration 
via a syntax-directed inductive relation, but leaves matters such as the use of 
metavariables in type inference to implementations. Such a declarative specifi- 
cation can be turned into a logic program via mode assignment (Berghofer and 
Nipkow, 2002), with the underlying constraints solved by first-order unification. 
In the setting of this chapter, however, constraints are more complex and the 
non-deterministic system is not so easily operationalised. 

Brady (2013) describes elaboration for Idris in terms of imperative tactics, 
taking inspiration from the work of McBride (1999) on the Oleg system. 

A full specification of unification in such a rich setting is complex, and I have 
only outlined the way it fits into the elaboration framework. The careful manage- 
ment of variable scope means that unification could be specified similarly to the 
Miller pattern unification algorithm of Chapter 4. The algorithm used by GHC, 
described by Vytiniotis et al. (2011), is very powerful but not straightforward to 
understand or implement. Further work in this area is desirable. 

I have outlined the treatment of higher-rank types, but have not discussed the 
role of bidirectional type inference in detail. Dunfield and Krishnaswami (2013) 
give an excellent account of a sound and complete typechecking algorithm for 
higher-rank polymorphism, in a similar spirit. 

Soundness of the elaboration algorithm with respect to the non-deterministic
specification is easy to show, and termination² follows from its structurally
recursive definition, but it would be valuable to prove further properties. In particular,
Luther (2003) discusses the coherence property, which requires that all possible
(non-deterministic) elaborations of a term should be behaviourally equivalent.
This formalises the intuition that elaboration should fill in details for which there
is only one sensible choice.

² Termination in the sense of reduction to constraint solving, that is; termination of the
constraint solver is another matter.

Chapter 8
Applications

In this chapter, the hard work of the previous chapters finally pays off. Having 
introduced the inch language and explained how to elaborate it into the evidence 
language, I now give examples of using it to write programs. I start with some 
familiar operations on vectors (8.1), before implementing merge sort (8.2) and left- 
leaning red-black tree insertion and deletion (8.3). I demonstrate an approach to 
checking the time complexity of function definitions (8.4). Finally, I show how 
to implement units of measure based on numeric constraints (8.5), in contrast to 
the built-in support for abelian groups described in Chapter 3. 

The inch preprocessor 

The examples in this chapter have been checked with a prototype implementation
of inch.¹ The prototype consists of a preprocessor that typechecks a source file
and converts it into type-correct GHC Haskell, erasing type dependencies. This
means that certain features cannot be supported. For example, large eliminations
(where types depend on shared terms) are impossible to implement.

The prototype implementation differs from the language laid out in the
previous chapters in a number of respects. In particular, it retains a strong distinction
between the term, type and kind levels, which limits its flexibility compared to
the final design. The kind system consists only of ⋆, ℤ and higher kinds; other
promoted datatypes and kind polymorphism are not implemented.

The language of shared expressions, that may occur in terms and in types,
is heavily restricted: only integers and arithmetic operations are available.
Similarly, type equality constraints may involve only integers, and GADTs may use
only integer indices. The kind ℕ is represented by ℤ with an inequality constraint.

¹ http://hackage.haskell.org/package/inch, https://github.com/adamgundry/inch

The flexible approach to implicit and explicit arguments based on type schemes,
described in Section 7.1 (page 145), is not implemented. Rather, ∀-quantifiers
are always implicit and Π-quantifiers are always explicit, even though they are
written with a dot (so Π (m :: ℕ) . τ means Π (m :: ℕ) → τ).

Terms that lie in the shared fragment must be marked with braces. This
includes applications of Π-quantified functions and the patterns that define them.
For example, if f :: Π (n :: ℕ) . Vec a n then f {x + 2} :: Vec a (x + 2). Otherwise,
the syntax is broadly that of Haskell extended with kind signatures, scoped type
variables, GADTs and higher-rank types. One minor extension is that multiple
variables may share a kind signature: for example, ∀(m n :: ℕ) . τ is legal.

Type inference is implemented along the lines of elaboration as described in 
Chapter 7, although instead of generating evidence terms, dependency-erased 
Haskell programs are produced. Constraint solving is based on the abelian group 
unification algorithm in Chapter 3, extended to the ring Z. Any remaining purely 
numeric constraints are checked using a decision procedure for Presburger arith- 
metic (Diatchki, 2011). This works well for linear constraints, but means that 
support for constraints involving multiplication is more limited. 

Kind inference is not performed, so kinds must be annotated explicitly (oth- 
erwise they default to *). This means that variables will usually be explicitly 
quantified. In a more complete implementation, this would not be necessary. 

Newtypes are not supported; where they are used in examples, they have been 
manually translated to the corresponding single-constructor data type behind the 
scenes. Support for typeclasses is extremely limited, and they will generally not 
be used in the examples. 

Despite these restrictions, it is still possible to implement useful examples. 
Where relevant, I will point out opportunities to improve the examples given a 
full-scale implementation of the inch system. 

8.1 Vectors 

Recall the definition of vectors as an indexed family of types:²

data Vec :: ⋆ → ℕ → ⋆ where
    Nil  :: Vec a 0
    Cons :: ∀a (n :: ℕ) . a → Vec a n → Vec a (n + 1)

² Sensitive Haskell programmers may wonder why the kind of Vec is not ℕ → ⋆ → ⋆, since
then Vec n is a monad for any n with the diagonal join, as shown in Subsection 5.2.4 (page 99).
Unfortunately, this would make it harder to regard Vec as a type indexed by ℕ, since Haskell
treats type application as injective.

Here are some standard functions on vectors. The types of head and tail ensure
they are never called on the empty vector, and lengths are tracked appropriately
in the other cases. Most of the function definitions use polymorphic recursion
and pattern-matching on GADTs, so their types must be specified. As discussed
in Subsection 5.1.1 (page 90), the helper function for reverse implicitly requires a
proof that (m + 1) + n ~ m + (n + 1), so additional constraint solving beyond
the inductive definition of + is required. The lookup function demonstrates the
use of Π-types: the index m must be supplied at runtime, but statically known
to be below the length n.

head :: ∀(n :: ℕ) a . Vec a (n + 1) → a
head (Cons x _) = x

tail :: ∀(n :: ℕ) a . Vec a (n + 1) → Vec a n
tail (Cons _ xs) = xs

append :: ∀a (m n :: ℕ) . Vec a m → Vec a n → Vec a (m + n)
append Nil ys = ys
append (Cons x xs) ys = Cons x (append xs ys)

reverse :: ∀(n :: ℕ) a . Vec a n → Vec a n
reverse xs = help xs Nil
  where
    help :: ∀(m n :: ℕ) a . Vec a m → Vec a n → Vec a (m + n)
    help Nil ys = ys
    help (Cons x xs) ys = help xs (Cons x ys)

lookup :: ∀(n :: ℕ) a . Π (m :: ℕ) . m < n ⇒ Vec a n → a
lookup {0} (Cons x _) = x
lookup {k + 1} (Cons _ xs) = lookup {k} xs
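
To relate these definitions to present-day GHC, here is a sketch of some of them using promoted Peano naturals. It rests on several assumptions that differ from inch: lengths are unary rather than integers, addition is a closed type family `Plus` that computes by recursion on its first argument, and the constraint m < n on lookup is replaced by a bounded index type `Fin`. The names `Plus`, `Fin`, `headV`, `tailV`, `appendV`, `lookupV` and `toList` are hypothetical. The reverse function is omitted because its helper would need a proof of (m + 1) + n ~ m + (n + 1), which GHC does not derive from the definition of `Plus`.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeFamilies #-}
import Data.Kind (Type)

data Nat = Zero | Suc Nat

-- Type-level addition, defined by recursion on the first argument.
type family Plus (m :: Nat) (n :: Nat) :: Nat where
  Plus 'Zero    n = n
  Plus ('Suc m) n = 'Suc (Plus m n)

data Vec (a :: Type) (n :: Nat) where
  Nil  :: Vec a 'Zero
  Cons :: a -> Vec a n -> Vec a ('Suc n)

-- The index 'Suc n rules out the empty vector, so no Nil case is needed.
headV :: Vec a ('Suc n) -> a
headV (Cons x _) = x

tailV :: Vec a ('Suc n) -> Vec a n
tailV (Cons _ xs) = xs

-- Each equation typechecks by unfolding Plus once.
appendV :: Vec a m -> Vec a n -> Vec a (Plus m n)
appendV Nil         ys = ys
appendV (Cons x xs) ys = Cons x (appendV xs ys)

-- Numbers strictly below n, replacing the constraint m < n.
data Fin (n :: Nat) where
  FZero :: Fin ('Suc n)
  FSuc  :: Fin n -> Fin ('Suc n)

lookupV :: Fin n -> Vec a n -> a
lookupV FZero    (Cons x _)  = x
lookupV (FSuc k) (Cons _ xs) = lookupV k xs

-- Forget the length index, for inspecting results.
toList :: Vec a n -> [a]
toList Nil         = []
toList (Cons x xs) = x : toList xs
```

The contrast with the inch versions is that arithmetic facts beyond the defining equations of `Plus` would have to be proved by hand, whereas inch delegates them to its constraint solver.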

The fold for vectors has a rank-2 type, because for the Cons constructor it needs
to abstract over the length m of the tail. Apart from the more informative type
signature, it is essentially the same as the traditional foldr for lists. Indeed, it
will erase to such a function at runtime.

foldVec :: ∀(f :: ℕ → ⋆) a (n :: ℕ) .
    f 0 → (∀(m :: ℕ) . a → f m → f (m + 1)) → Vec a n → f n
foldVec n c Nil = n
foldVec n c (Cons x xs) = c x (foldVec n c xs)

As one would expect, foldVec Nil Cons is well-typed and equal to the identity
function on vectors. Unfortunately the usual definition of append via a fold,

append' xs ys = foldVec ys Cons xs

does not typecheck, because of the lack of type-level λ-abstraction. It is possible
to work around this, at the cost of some syntactic overhead, using a newtype:

newtype Plus a m n = Plus { unPlus :: a (m + n) }

append'' :: ∀a (m n :: ℕ) . Vec a m → Vec a n → Vec a (m + n)
append'' xs ys = unPlus (foldVec (Plus ys)
                            (\z zs → Plus (Cons z (unPlus zs))) xs)
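
The same workaround can be tried out in GHC. The following sketch assumes unary naturals and a type family `Plus` in place of inch's built-in arithmetic, and it uses a newtype `PlusVec` that is specialised to vectors, so it is not a transcription of the definitions above. All names other than `foldVec` are hypothetical.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, RankNTypes, TypeFamilies #-}
import Data.Kind (Type)

data Nat = Zero | Suc Nat

type family Plus (m :: Nat) (n :: Nat) :: Nat where
  Plus 'Zero    n = n
  Plus ('Suc m) n = 'Suc (Plus m n)

data Vec (a :: Type) (n :: Nat) where
  Nil  :: Vec a 'Zero
  Cons :: a -> Vec a n -> Vec a ('Suc n)

-- The rank-2 fold: the Cons case must work for every tail length m.
foldVec :: forall (f :: Nat -> Type) a n .
           f 'Zero -> (forall m . a -> f m -> f ('Suc m)) -> Vec a n -> f n
foldVec n _ Nil         = n
foldVec n c (Cons x xs) = c x (foldVec n c xs)

-- PlusVec a n m wraps a vector of length m + n. Partially applied,
-- PlusVec a n has kind Nat -> Type, so it can instantiate f even though
-- there is no type-level lambda.
newtype PlusVec a (n :: Nat) (m :: Nat) =
  PlusVec { unPlusVec :: Vec a (Plus m n) }

appendV :: Vec a m -> Vec a n -> Vec a (Plus m n)
appendV xs ys =
  unPlusVec (foldVec (PlusVec ys)
                     (\z zs -> PlusVec (Cons z (unPlusVec zs)))
                     xs)

-- Forget the length index, for inspecting results.
toList :: Vec a n -> [a]
toList Nil         = []
toList (Cons x xs) = x : toList xs
```

Instantiating f with `Vec a` directly gives the identity fold `foldVec Nil Cons`, while `appendV` needs the newtype wrapper to express the motive "a vector of length m + n" as a type constructor applied to m.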

8.2 Merge sort 

I now implement merge sort, based on a similar example by Altenkirch et al. 
(2005) in the dependently typed programming language Epigram (McBride and 
McKinna, 2004). The type of the sorting function guarantees that it preserves 
the length of the vector and returns a sorted result, if anything. No proof ma- 
nipulation is necessary, and the program erases to a natural implementation of 
merge sort for lists of integers. I do not show that the result is a permutation of 
the input, as this would require a more expressive type system; Xi (2008) does so 
for quicksort in ATS. On similar lines, Xi (1998) gives an example of merge sort 
in Dependent ML that verifies the length of the input is preserved. 

Of course, Haskell's type system does not check the totality of our programs, 
so this is only a partial correctness result. Higher-rank types allow me to express 
the fold-based recursion structure of the key functions, making the termination 
reasoning more obvious to the reader, if not the compiler. 

In principle, it is possible to express something similar using GADTs and 
type families, but the complexity of the implementation and the manual proofs 
involved would be much greater. Mu (2007) provides an impressive example that 
verifies length-preservation, but not ordering, in this manner. 

The point of this example is not that it is a verified implementation of merge 
sort, as there are many such programs already. Rather, it shows the utility 
of type-level numbers in Haskell and the ease with which they integrate with 
Haskell programming idioms (such as folds) and features (higher-rank types and
polymorphic recursion).

A Tree is a leaf-labelled binary tree indexed by the number of leaves. Its 
construction ensures that it is balanced, as the subtrees of each node differ in size 
by at most one. 

data Tree :: ⋆ → ℕ → ⋆ where
    Empty :: Tree a 0
    Leaf  :: a → Tree a 1
    Even  :: ∀a (n :: ℕ) . 1 ≤ n ⇒ Tree a n → Tree a n →
                 Tree a (2 * n)
    Odd   :: ∀a (n :: ℕ) . 1 ≤ n ⇒ Tree a (n + 1) → Tree a n →
                 Tree a (2 * n + 1)

Just like for vectors, the fold for trees uses higher-rank types. This version is
slightly simplified, as it hides the distinction between even and odd nodes.

foldTree :: ∀(f :: ℕ → ⋆) a (n :: ℕ) .
    f 0 → (a → f 1) → (∀(m n :: ℕ) . f m → f n → f (m + n)) →
    Tree a n → f n
foldTree e l n Empty = e
foldTree e l n (Leaf a) = l a
foldTree e l n (Even x y) = n (foldTree e l n x) (foldTree e l n y)
foldTree e l n (Odd x y) = n (foldTree e l n x) (foldTree e l n y)

A tree can be built by folding over a vector, replacing Nil with Empty and
inserting elements using the balance-preserving insert function:

mkTree :: ∀a (n :: ℕ) . Vec a n → Tree a n
mkTree = foldVec Empty insert
  where
    insert :: ∀a (n :: ℕ) . a → Tree a n → Tree a (n + 1)
    insert i Empty = Leaf i
    insert i (Leaf j) = Even (Leaf i) (Leaf j)
    insert i (Even l r) = Odd (insert i l) r
    insert i (Odd l r) = Even l (insert i r)

A simple definition such as mkTree, which does not pattern-match on GADTs or 
use polymorphic recursion, does not need a top-level type signature. The bidirec- 
tional type inference algorithm is quite capable of inferring this type. However, I 
include the signature for consistency and clarity. 

Ordered vectors are indexed by lower and upper bounds, plus length. They are
restricted to containing integers (by the Π-quantifier). Ideally one should extend
ℤ with top and bottom elements, to allow unbounded data. These restrictions
derive from the limitations of the preprocessor; the theory given in Chapter 6 can
support the general case.

data OVec :: ℤ → ℤ → ℕ → ⋆ where
    ONil  :: ∀(l u :: ℤ) . l ≤ u ⇒ OVec l u 0
    OCons :: ∀(l u :: ℤ) (n :: ℕ) . Π (x :: ℤ) . l ≤ x ⇒
                 OVec x u n → OVec l u (n + 1)

Given two ordered vectors, the merge function combines them to produce a
single ordered vector. It uses the syntax for guards that introduce local
constraints described in Subsection 5.2.7. The second guard is redundant, but to see
this the implementation would need to negate the results of previous tests when
checking patterns, which is not currently supported.

merge :: ∀(l u :: ℤ) (m n :: ℕ) .
    OVec l u m → OVec l u n → OVec l u (m + n)
merge ONil ys = ys
merge xs ONil = xs
merge (OCons {x} xs) (OCons {y} ys)
    | {x ≤ y} = OCons {x} (merge xs (OCons {y} ys))
    | {x > y} = OCons {y} (merge (OCons {x} xs) ys)

The type In l u represents integers in the interval [l, u]:

data In :: ℤ → ℤ → ⋆ where
    In :: ∀(l u :: ℤ) . Π (x :: ℤ) . (l ≤ x, x ≤ u) ⇒ In l u

The flatten function converts a binary tree of numbers in an interval to an ordered
vector on that same interval, by invoking the higher-rank fold over the tree, calling
merge at each node and converting each leaf value into a vector of length 1.

flatten :: ∀(l u :: ℤ) (m :: ℕ) . l ≤ u ⇒ Tree (In l u) m → OVec l u m
flatten = foldTree ONil invec merge
  where invec :: ∀(l u :: ℤ) . In l u → OVec l u 1
        invec (In {i}) = OCons {i} ONil

To merge sort a vector of numbers in an interval to produce an ordered vector,
it is enough to construct and flatten a tree:

sort :: ∀(l u :: ℤ) (m :: ℕ) . l ≤ u ⇒ Vec (In l u) m → OVec l u m
sort = flatten ∘ mkTree

Now evaluating sort (Cons (In {3}) (Cons (In {1}) (Cons (In {2}) Nil))) produces
the sorted list OCons {1} (OCons {2} (OCons {3} ONil)) as expected.
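
To make concrete what the program looks like once indices and constraints are erased, here is a plain Haskell sketch of the same algorithm on lists of integers. It is written by hand for illustration and is not the output of the inch preprocessor; the name `msort` is chosen to avoid suggesting otherwise, and the length, bound and ordering guarantees carried by the types above are no longer checked.

```haskell
-- Erased counterpart of Tree: the size index and the constraints are gone.
data Tree a
  = Empty
  | Leaf a
  | Even (Tree a) (Tree a)
  | Odd (Tree a) (Tree a)

-- The fold no longer needs a higher-rank type.
foldTree :: b -> (a -> b) -> (b -> b -> b) -> Tree a -> b
foldTree e _ _ Empty      = e
foldTree _ l _ (Leaf a)   = l a
foldTree e l n (Even x y) = n (foldTree e l n x) (foldTree e l n y)
foldTree e l n (Odd x y)  = n (foldTree e l n x) (foldTree e l n y)

-- Balance-preserving insertion: Even nodes have subtrees of equal size,
-- Odd nodes have one extra element on the left.
insert :: a -> Tree a -> Tree a
insert i Empty      = Leaf i
insert i (Leaf j)   = Even (Leaf i) (Leaf j)
insert i (Even l r) = Odd (insert i l) r
insert i (Odd l r)  = Even l (insert i r)

-- foldVec erases to foldr on lists.
mkTree :: [a] -> Tree a
mkTree = foldr insert Empty

-- Merge two ordered lists into one ordered list.
merge :: [Integer] -> [Integer] -> [Integer]
merge [] ys = ys
merge xs [] = xs
merge (x : xs) (y : ys)
  | x <= y    = x : merge xs (y : ys)
  | otherwise = y : merge (x : xs) ys

-- Each leaf becomes a singleton list; each node merges its subtrees.
flatten :: Tree Integer -> [Integer]
flatten = foldTree [] (\i -> [i]) merge

msort :: [Integer] -> [Integer]
msort = flatten . mkTree
```

With these definitions `msort [3, 1, 2]` evaluates to `[1, 2, 3]`, mirroring the indexed example above.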

8.3 Left-leaning red-black trees

I now move on to a more advanced example data structure, red-black trees. A
left-leaning red-black tree is a self-balancing binary search tree, designed to give
good performance for insertion, deletion and membership test operations.³ Every
node is coloured either red or black, subject to the following invariants:

1. All leaves, and both children of a red node, are black. 

2. The right child of a black node is black. 

3. Both children of an internal node have the same black height (the number 
of black nodes on any path to a leaf). 

There has been much research on implementing red-black trees in functional 
languages, building on foundations laid by Okasaki (1998), who dealt with inser- 
tion but not deletion. Might (n.d.) showed how to extend Okasaki's implemen- 
tation to deletion by adding two extra colours for tracking temporary invariant 
violations. Yamamoto (2011) applied Okasaki's work to left-leaning trees. 

Another strand of research focused on provably correct functional implemen- 
tations. Kahrs (2001) demonstrated an ingenious technique for enforcing the bal- 
ance invariant of red-black trees using the Haskell type system. Ek et al. (2011) 
used Agda to verify the binary search tree and colour invariants of left-leaning 
red-black tree insertion, and Oster (2011) extended this to deletion. 

Most implementations of red-black trees (both functional and imperative) 
work by constructing unbalanced trees and then applying a separate rebalancing 
operation. This does not work well when enforcing the invariants through the 
type system, because of the need to represent slightly malformed trees. Xi (2007) 
implemented red-black trees in ATS, following Okasaki's approach, indexing trees 
by the number of red-red colour violations they contain, and requiring that well- 
formed trees contain no colour violations. In this implementation, I will use 
McBride and McKinna's idea⁴ of representing the path to the point where there
would be an invariant violation using a Huet-style zipper. This avoids the need 
to represent trees that do not obey the invariants. 

The choice of left-leaning red-black trees here is not crucial. The technique of 
avoiding malformed trees using a zipper works well for other self-balancing binary 
search trees such as normal red-black trees or AVL trees. 

³ Left-leaning red-black trees were introduced by Sedgewick (2008), as a simplification of the
original red-black trees of Bayer (1972), which are obtained by omitting invariant 2. Regarding red nodes
as part of their parent nodes, an LLRBT is a 2-3 tree; a normal red-black tree is a 2-3-4 tree.

⁴ Red-black tree insertion was implemented as an example with the Epigram 1 distribution.

8.3.1 Enforcing red-black tree invariants via types

To keep track of colours in the type system, I define the following singleton
GADT. The need for a singleton is a limitation of the preprocessor: in a full
implementation, one could simply define an algebraic data type for colours (or use
the booleans) and use its constructors promoted to the type level, which would be
slightly neater.

type Black = 0
type Red = 1

data Colour :: ℤ → ⋆ where
    Black :: Colour Black
    Red :: Colour Red

The type RBTree represents well-formed red-black trees. Trees are indexed
by lower and upper bounds, their colour and black height, and the type checker
guarantees that all the invariants hold. Each leaf E stores a proof that its lower
bound is strictly smaller than its upper bound, ensuring that the keys are stored in
ascending order and there are no duplicated keys. There are separate constructors
for red and black internal nodes (TR and TB respectively). The indexing ensures
that the colour invariants are observed. A Π-type is used to store the key x on
an internal node, so that the ordering invariant can be maintained.

data RBTree :: ℤ → ℤ → ℤ → ℕ → ⋆ where
    E  :: ∀(i j :: ℤ) . i < j ⇒ RBTree i j Black 0
    TR :: ∀(i j :: ℤ) (n :: ℕ) . Π (x :: ℤ) .
              RBTree i x Black n → RBTree x j Black n → RBTree i j Red n
    TB :: ∀(i j c :: ℤ) (n :: ℕ) . Π (x :: ℤ) .
              RBTree i x c n → RBTree x j Black n → RBTree i j Black (n + 1)

The interface that would be exposed to the user of the red-black tree library hides
the colour (always black) and the black height using the existential type RBT.
However, the lower and upper bounds are visible. This distinguishes
invariants used only for the implementation of the library, which will change as
nodes are inserted and deleted, from those relevant for the user. Alternative
choices, such as concealing the bounds as well, are also possible.

data RBT :: ℤ → ℤ → ⋆ where
    RBT :: ∀(i j :: ℤ) (n :: ℕ) . RBTree i j Black n → RBT i j
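
The colour and black-height part of this indexing can be reproduced in GHC with promoted datatypes, as the following sketch shows. It deliberately simplifies the definitions above: the lower and upper bounds are dropped, so the ordering of keys is not enforced by the types, keys are plain `Int` values, and `Nat`, `member`, `blackHeight` and `example` are hypothetical names introduced for the illustration.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}
import Data.Kind (Type)

data Nat = Zero | Suc Nat
data Colour = Red | Black

-- Trees indexed by the colour of the root and the black height.
data RBTree (c :: Colour) (n :: Nat) where
  -- Leaves are black and have black height zero.
  E  :: RBTree 'Black 'Zero
  -- Both children of a red node are black, with equal black height.
  TR :: Int -> RBTree 'Black n -> RBTree 'Black n -> RBTree 'Red n
  -- The right child of a black node is black (left-leaning); the left
  -- child may have either colour.
  TB :: Int -> RBTree c n -> RBTree 'Black n -> RBTree 'Black ('Suc n)

-- The user-facing type hides the black height; the root is always black.
data RBT where
  RBT :: RBTree 'Black n -> RBT

-- Membership assumes the keys are in search-tree order, which this
-- simplified indexing does not check.
member :: Int -> RBTree c n -> Bool
member _ E = False
member k (TR x l r) = case compare k x of
  LT -> member k l
  EQ -> True
  GT -> member k r
member k (TB x l r) = case compare k x of
  LT -> member k l
  EQ -> True
  GT -> member k r

-- The runtime black height, which agrees with the index n.
blackHeight :: RBTree c n -> Int
blackHeight E          = 0
blackHeight (TR _ l _) = blackHeight l
blackHeight (TB _ l _) = 1 + blackHeight l

-- A well-formed tree with keys 1 and 2. Ill-formed candidates such as
-- TB 2 E (TR 3 E E) or TR 1 (TR 0 E E) E are rejected by the type checker.
example :: RBTree 'Black ('Suc 'Zero)
example = TB 2 (TR 1 E E) E
```

The type signatures on `member` and `blackHeight` are required, since both recurse at different colour and height indices.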



Given the type RBTree, the corresponding type of one-hole contexts can be 
derived mechanically (McBride, 2001). These can be used to navigate a tree via 
a zipper (Huet, 1997). The type of one-hole contexts is indexed by two copies of 
the RBTree indices: those provided at the root, and those required at the hole. 
Since the root is always black, however, I can do away with one of the indices. 

data TreeZip :: ℤ → ℤ → ℕ →        -- root indices
                ℤ → ℤ → ℤ → ℕ →    -- hole indices
                * where
  Root :: ∀(i j :: ℤ) (n :: ℕ) . TreeZip i j n i j Black n
  ZRL  :: ∀(i j i' j' :: ℤ) (n n' :: ℕ) . Π (x :: ℤ) .
            TreeZip i' j' n' i j Red n → RBTree x j Black n →
              TreeZip i' j' n' i x Black n
  ZRR  :: ∀(i' j' i j :: ℤ) (n' n :: ℕ) . Π (x :: ℤ) .
            RBTree i x Black n → TreeZip i' j' n' i j Red n →
              TreeZip i' j' n' x j Black n
  ZBL  :: ∀(i' j' i j c :: ℤ) (n' n :: ℕ) . Π (x :: ℤ) .
            TreeZip i' j' n' i j Black (n + 1) → RBTree x j Black n →
              TreeZip i' j' n' i x c n
  ZBR  :: ∀(i' j' i j c :: ℤ) (n' n :: ℕ) . Π (x :: ℤ) .
            RBTree i x c n → TreeZip i' j' n' i j Black (n + 1) →
              TreeZip i' j' n' x j Black n

Given a context and a tree that fits in the hole, the whole tree can be rebuilt 
by plug. This function is well-typed because the indexing discipline of TreeZip 
exactly matches the demands of RBTree. The function could also be obtained for 
free using generic programming techniques (Löh and Magalhães, 2011). 

plug :: ∀(i' j' i j c :: ℤ) (n n' :: ℕ) .
          RBTree i j c n → TreeZip i' j' n' i j c n → RBTree i' j' Black n'
plug t Root          = t
plug t (ZRL {x} z r) = plug (TR {x} t r) z
plug t (ZRR {x} l z) = plug (TR {x} l t) z
plug t (ZBL {x} z r) = plug (TB {x} t r) z
plug t (ZBR {x} l z) = plug (TB {x} l t) z
8.3.2 Search 



Searching for a key x in a red-black tree has two possible outcomes: either 
Found z t, where z is the context in which x was found and t is the subtree with 
x at the root, or Missing z, where z is the context that should have contained x. 
This detailed search result will later be used to implement insertion and deletion. 

data SearchResult :: ℤ → ℤ → ℤ → ℕ → * where
  Found   :: ∀(x i' j' i j c :: ℤ) (n' n :: ℕ) .
               TreeZip i' j' n' i j c n → RBTree i j c n →
                 SearchResult x i' j' n'
  Missing :: ∀(x i' j' i j :: ℤ) (n' :: ℕ) . (i < x, x < j) ⇒
               TreeZip i' j' n' i j Black 0 →
                 SearchResult x i' j' n'

To search a tree, a context is built up by comparing the key x to the value y stored 
at each node, and descending into the appropriate subtree, until the key is found 
or a leaf is reached. The invariants make it hard to get wrong: if a conditional 
test is omitted, or an invalid result returned, the typechecker will object. 

search :: ∀(i' j' :: ℤ) (n' :: ℕ) . Π (x :: ℤ) . (i' < x, x < j') ⇒
            RBTree i' j' Black n' → SearchResult x i' j' n'
search {x} = help Root
  where
    help :: ∀(i j c :: ℤ) (n :: ℕ) . (i < x, x < j) ⇒
              TreeZip i' j' n' i j c n → RBTree i j c n →
                SearchResult x i' j' n'
The user of the library can be presented with a simple membership test: 

member :: ∀(i j :: ℤ) . Π (x :: ℤ) . (i < x, x < j) ⇒ RBT i j → Bool
member {x} (RBT t) = case search {x} t of
  Missing _ → False
  Found _ _ → True
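
The interplay of zipper, search and plug can be sketched in plain Haskell without any of the indexing. This is an unverified analogue under stated assumptions: the types Tree and Ctx and the constructors InL and InR are hypothetical stand-ins for RBTree and TreeZip, colours and bounds are dropped, and nothing stops the programmer from omitting a comparison.

```haskell
-- Assumption: an unindexed binary search tree standing in for RBTree.
data Tree = Leaf | Node Tree Int Tree
  deriving (Eq, Show)

-- One-hole contexts: the path from the hole back up to the root.
data Ctx
  = Top
  | InL Int Ctx Tree   -- hole is the left child; right sibling stored
  | InR Int Tree Ctx   -- hole is the right child; left sibling stored
  deriving (Eq, Show)

-- Rebuild the whole tree from a subtree and its context.
plug :: Tree -> Ctx -> Tree
plug t Top         = t
plug t (InL x z r) = plug (Node t x r) z
plug t (InR x l z) = plug (Node l x t) z

data SearchResult
  = Found Ctx Tree   -- context and the subtree whose root holds the key
  | Missing Ctx      -- context of the leaf where the key would belong
  deriving (Eq, Show)

-- Descend from the root, extending the context at each comparison.
search :: Int -> Tree -> SearchResult
search x = go Top
  where
    go z Leaf = Missing z
    go z t@(Node l y r)
      | x < y     = go (InL y z r) l
      | x > y     = go (InR y l z) r
      | otherwise = Found z t

member :: Int -> Tree -> Bool
member x t = case search x t of
  Missing _ -> False
  Found _ _ -> True
```

In either outcome, plugging the focused subtree (or a leaf) back into the returned context reproduces the original tree, which is what makes the search result a suitable starting point for insertion and deletion.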
8.3.3 Insertion 

To insert an element into a red-black tree, use search to find the appropriate 
location, then add a new node and proceed back up the tree, rebalancing on 
the way. The InsProb datatype represents the kind of problems that may be 
encountered when rebalancing the tree: either it is on the level (inserting a tree 
into a hole of the correct black height, though not necessarily the same colour) 
or in a panic (because a red child would have a red parent). 

data InsProb :: ℤ → ℤ → ℤ → ℕ → * where
  Level   :: ∀(i j c c' :: ℤ) (n :: ℕ) .
               Colour c' → RBTree i j c' n →
                 InsProb i j c n
  PanicRB :: ∀(i j :: ℤ) (n :: ℕ) . Π (x :: ℤ) .
               RBTree i x Red n → RBTree x j Black n →
                 InsProb i j Red n
  PanicBR :: ∀(i j :: ℤ) (n :: ℕ) . Π (x :: ℤ) .
               RBTree i x Black n → RBTree x j Red n →
                 InsProb i j Red n

The insertRBT function searches for the element x, and if it is not present, calls 
the ins function defined below with the appropriate insertion problem. 

insertRBT :: ∀(i j :: ℤ) . Π (x :: ℤ) . (i < x, x < j) ⇒
               RBT i j → RBT i j
insertRBT {x} (RBT t) = solveIns (search {x} t)
  where
    solveIns :: ∀(n :: ℕ) . SearchResult x i j n → RBT i j
    solveIns (Missing z) = ins (Level Red (TR {x} E E)) z
    solveIns (Found _ _) = RBT t

To solve an insertion problem, move out through the context, updating the prob- 
lem appropriately at each step. While this definition looks intimidating (!), the 
types make it difficult to get wrong: the typechecker will object if a tree is ever 
constructed that breaks the invariants (either ordering or colouring). It is much 
easier to construct interactively than in batch mode. In fact, I first implemented 
it in Agda using the support for interactive construction, then transcribed it for 
inch. The Agsy proof search tool (Lindblad and Benke, 2006) is able to fill in 
many cases automatically, further reducing the effort involved. 



ins :: ∀(i' j' i j c :: ℤ) (n' n :: ℕ) .
         InsProb i j c n → TreeZip i' j' n' i j c n → RBT i' j'
ins (Level Red (TR {x} t0 t1)) Root = RBT (TB {x} t0 t1)
ins (Level Black t)            Root = RBT t

ins (Level Red t)   (ZRL {x} z t') = ins (PanicRB {x} t t') z
ins (Level Red t)   (ZRR {x} t' z) = ins (PanicBR {x} t' t) z
ins (Level Black t) (ZRL {x} z t') = ins (Level Red (TR {x} t t')) z
ins (Level Black t) (ZRR {x} t' z) = ins (Level Red (TR {x} t' t)) z
ins (Level c t)     (ZBL {x} z t') = ins (Level Black (TB {x} t t')) z
ins (Level Black t) (ZBR {x} t' z) = ins (Level Black (TB {x} t' t)) z

ins (Level Red (TR {y} t1 t2)) (ZBR {x} E z) =
    RBT (plug (TB {y} (TR {x} E t1) t2) z)
ins (Level Red (TR {y} t1 t2)) (ZBR {x} (TB {w} t t') z) =
    RBT (plug (TB {y} (TR {x} (TB {w} t t') t1) t2) z)
ins (Level Red (TR {y} t1 t2)) (ZBR {x} (TR {w} t t') z) =
    ins (Level Red (TR {x} (TB {w} t t') (TB {y} t1 t2))) z

ins (PanicRB {y} (TR {w} t0 t1) t2) (ZBL {x} z t) =
    ins (Level Red (TR {y} (TB {w} t0 t1) (TB {x} t2 t))) z
ins (PanicBR {y} t0 (TR {w} t1 t2)) (ZBL {x} z t) =
    ins (Level Red (TR {w} (TB {y} t0 t1) (TB {x} t2 t))) z
ins (PanicRB {y} (TR {w} t0 t1) t2) (ZBR {x} t z) =
    ins (Level Red (TR {w} (TB {x} t t0) (TB {y} t1 t2))) z
ins (PanicBR {y} t0 (TR {w} t1 t2)) (ZBR {x} t z) =
    ins (Level Red (TR {y} (TB {x} t t0) (TB {w} t1 t2))) z



8.3.4 Deletion 

Deleting a key from a red-black tree is slightly more complicated than insertion. 
The search function positions the focus on the node to be deleted, then calls 
delFocus, assuming the key exists. 

delete :: ∀(i j :: ℤ) . Π (x :: ℤ) . (i < x, x < j) ⇒ RBT i j → RBT i j
delete {x} (RBT t) = solveDel (search {x} t)
  where
    solveDel :: ∀(n :: ℕ) . SearchResult x i j n → RBT i j
    solveDel (Missing _) = RBT t
    solveDel (Found z t) = delFocus t z



To delete the node at the focus, provided the right subtree has black height 1 
or more, the deleted key can be replaced with the minimum of its right subtree 
(using findMin defined below). The base cases (where the right subtree has black 
height zero) are handled individually. 

delFocus :: ∀(i' j' i j c :: ℤ) (n' n :: ℕ) .
              RBTree i j c n → TreeZip i' j' n' i j c n → RBT i' j'
delFocus E                          z = RBT (plug E z)
delFocus (TR {x} E E)               z = RBT (plug E (wantBlack z))
delFocus (TB {x} E E)               z = del E z
delFocus (TB {x} (TR {y} E E) E)    z = RBT (plug (TB {y} E E) z)
delFocus (TR {x} t0 (TB {y} t1 t2)) z =
    findMin (TB {y} t1 t2) (\ {k} → ZRR {k} (wkTree t0) z)
delFocus (TB {x} t0 (TB {y} t1 t2)) z =
    findMin (TB {y} t1 t2) (\ {k} → ZBR {k} (wkTree t0) z)

The only context in which a red node can occur is the left child of a black node, 
which also accepts black nodes. Thus the wantBlack function can change the type 
from the former to the latter. 

wantBlack :: ∀(i' j' i j :: ℤ) (n' n :: ℕ) .
               TreeZip i' j' n' i j Red n → TreeZip i' j' n' i j Black n
wantBlack (ZBL {x} z r) = ZBL {x} z r

Deletion may require the upper bound of a subtree to be weakened, which needs 
a traversal of its right spine to satisfy the type checker. This could be replaced 
with unsafeCoerce, since the inequality proofs being manipulated are not retained 
at runtime, so it is operationally the identity function. 

wkTree :: ∀(i j j' c n :: ℤ) . j < j' ⇒ RBTree i j c n → RBTree i j' c n
wkTree E              = E
wkTree (TR {x} t0 t1) = TR {x} t0 (wkTree t1)
wkTree (TB {x} t0 t1) = TB {x} t0 (wkTree t1)

The findMin function works inside the right subtree of the node whose key is 
being deleted, looking for the minimum key, which will be used to replace the 
deleted one. The zipper context abstracts over the (as yet unknown) minimum 
key. If the minimum is found on a red node, it can simply be removed and the 
tree be reconstructed. However, if the minimum is on a black node or leaf, then 
the del function is called to decrease the black height. 
findMin :: ∀(i' j' i j c :: ℤ) (n' n :: ℕ) . RBTree i j c (n + 1) →
             (Π (k :: ℤ) . i < k ⇒ TreeZip i' j' n' k j c (n + 1)) →
               RBT i' j'
findMin (TB {x} E E)            f = del E (f {x})
findMin (TB {x} (TR {y} E E) t) f = RBT (plug E (ZBL {x} (f {y}) t))
findMin (TR {x} (TB {y} E E) t) f = del E (ZRL {x} (f {y}) t)
findMin (TR {x} (TB {y} (TR {k} E E) t0) t1) f =
    RBT (plug E (ZBL {y} (ZRL {x} (f {k}) t1) t0))
findMin (TR {x} (TB {y} (TB {w} t0 t1) t2) t3) f =
    findMin (TB {w} t0 t1) (\ {k} → ZBL {y} (ZRL {x} (f {k}) t3) t2)
findMin (TB {x} (TB {y} t0 t1) t2) f =
    findMin (TB {y} t0 t1) (\ {k} → ZBL {x} (f {k}) t2)

When deleting a black leaf (either directly or because it is the minimum in the 
right subtree of a deleted internal node), the black height must be decremented. 
Generally, the problem is to fit a tree of black height n into a hole that expects 
a tree of height n + 1. The del function works its way upwards, rebalancing after 
deletion, in a similar way to ins. Again, this definition is much easier to write 
than to read, thanks to the automation tool Agsy (Lindblad and Benke, 2006). 

del :: ∀(i' j' i j :: ℤ) (n' n :: ℕ) . RBTree i j Black n →
         TreeZip i' j' n' i j Black (n + 1) → RBT i' j'
del t Root = RBT t
del t (ZRL {x} z (TB {y} t0 t1)) = colourOf t0
    (RBT (plug (TB {y} (TR {x} t t0) t1) (wantBlack z)))
    (\ {w} t0' t0'' → RBT (plug (TR {w} (TB {x} t t0') (TB {y} t0'' t1)) z))
del t (ZRR {x} (TB {y} t0 t1) z) = colourOf t0
    (RBT (plug (TB {x} (TR {y} t0 t1) t) (wantBlack z)))
    (\ {w} t0' t0'' → RBT (plug (TR {y} (TB {w} t0' t0'') (TB {x} t1 t)) z))
del t (ZBL {x} z (TB {y} t0 t1)) = colourOf t0
    (del (TB {y} (TR {x} t t0) t1) z)
    (\ {w} t0' t0'' → RBT (plug (TB {w} (TB {x} t t0') (TB {y} t0'' t1)) z))
del t (ZBR {x} (TR {y} t0 (TB {w} t1 t2)) z) = colourOf t1
    (RBT (plug (TB {y} t0 (TB {x} (TR {w} t1 t2) t)) z))
    (\ {v} t1' t1'' → RBT (plug (TB {w} (TR {y} t0 (TB {v} t1' t1''))
                                         (TB {x} t2 t)) z))
del t (ZBR {x} (TB {y} t0 t1) z) = colourOf t0
    (del (TB {x} (TR {y} t0 t1) t) z)
    (\ {w} t0' t0'' → RBT (plug (TB {y} (TB {w} t0' t0'') (TB {x} t1 t)) z))
The colourOf eliminator determines if a tree is red or black, and calls the 
corresponding argument. For red trees, it provides the children of the node to 
the callback. This reduces the number of cases in del, because each case depends 
on the colour of a subtree, but not whether it is a leaf or an internal node. 

colourOf :: ∀ a (i j c n :: ℤ) .
              RBTree i j c n →
              ((c ~ Black) ⇒ a) →
              ((c ~ Red) ⇒ Π (x :: ℤ) . RBTree i x Black n →
                                          RBTree x j Black n → a) → a
colourOf E              b g = b
colourOf (TB {x} _ _)   b g = b
colourOf (TR {x} t0 t1) b g = g {x} t0 t1



8.4 Tracking time complexity 

Danielsson (2008) introduced the Thunk library for verifying the time complex- 
ity of purely functional data structures in the dependently typed programming 
language Agda. He indexes a monad by the number of computation steps re- 
quired to deliver a value in weak head normal form. Function definitions must 
be annotated with calls to an operation that increments this number. 

newtype Cost (n :: ℕ) a = Hide {force :: a}

The implementation of Cost and the primitive functions on it are hidden, 
because Cost is really a newtype with phantom type parameter n. This avoids 
runtime overhead, but if it were exposed to the user then the library invariants 
would be easily violated. Agda provides a language construct abstract to support 
this, and a similar abstraction barrier can be created in Haskell using modules. 

The return and bind functions witness the fact that Cost is a monad indexed 
by the monoid (N, +). That is, any value can be computed in no steps, and if 
some a can be computed in m steps, and used to compute some b in n steps, then 
the overall computation takes m + n steps. 

return :: a → Cost 0 a
return = Hide

bind :: ∀(m n :: ℕ) a b . Cost m a → (a → Cost n b) → Cost (m + n) b
bind (Hide x) f = wait (f x)
If a value can be computed in m steps, then it can be computed in n steps for 
any n larger than m. Unlike Danielsson's version, which requires the caller to 
specify a number of steps to wait, this exploits the inequality constraints of inch 
to provide a more flexible interface. 

wait :: ∀(m n :: ℕ) a . m ≤ n ⇒ Cost m a → Cost n a
wait (Hide a) = Hide a

A crucial part of the methodology is to annotate every line of every function 
definition being counted with a call to tick, which increments the counter. 

tick :: ∀(n :: ℕ) a . Cost n a → Cost (n + 1) a
tick = wait

A useful helper function, returnW, allows a value to be injected into the monad 
with an arbitrary weakening of the upper bound. 

returnW :: ∀(n :: ℕ) a . a → Cost n a
returnW x = wait (return x)
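
For readers who want to experiment outside inch, a similar indexed interface can be approximated in GHC Haskell using GHC.TypeLits. This is only a sketch under assumptions: the names returnC and bindC are hypothetical (chosen to avoid clashing with the Prelude), the general wait is omitted because GHC has no built-in solver for the inequality m ≤ n, and tick is therefore defined directly rather than via wait.

```haskell
{-# LANGUAGE DataKinds, KindSignatures, TypeOperators #-}

import GHC.TypeLits

-- The step count n is a phantom index; only the value is stored.
newtype Cost (n :: Nat) a = Hide { force :: a }

-- Any value can be delivered in zero steps.
returnC :: a -> Cost 0 a
returnC = Hide

-- Sequencing adds the step counts of the two computations.
bindC :: Cost m a -> (a -> Cost n b) -> Cost (m + n) b
bindC (Hide x) f = Hide (force (f x))

-- Charge one step. Defined directly, since this sketch has no 'wait'.
tick :: Cost n a -> Cost (n + 1) a
tick (Hide x) = Hide x

-- Inject a value with an arbitrary upper bound on its cost.
returnW :: a -> Cost n a
returnW = Hide

-- Two ticks around a pure value: the index reduces to 2.
three :: Cost 2 Int
three = tick (tick (returnC 3))

-- Binding a 2-step computation to a 1-step continuation costs 3 steps.
four :: Cost 3 Int
four = bindC three (\x -> tick (returnC (x + 1)))
```

Changing the annotation on four to Cost 2 Int makes the sketch fail to type check, which is the behaviour the library relies on. As in the original, the Hide constructor and force would need to be kept behind a module boundary for the indices to mean anything.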

Danielsson's approach works well for verifying the time complexity of the merge 
sort and red-black tree operations defined in the previous sections. The inch 
constraint solver is able to deal with the proof obligations automatically, rather 
than requiring the user to supply proofs of trivial arithmetic properties. There are 
some obligations on the user of the library not captured by the types: every user 
function must be annotated with calls to tick, the force function must not occur 
inside code being timed, and library functions must not be partially applied. 

To show how the approach works, I will reimplement red-black tree search 
with complexity annotations, proving that the time for the membership test is 
linear in the height of the tree (that is, logarithmic in the number of elements). 
First, the data type declaration for the zipper must have an extra index, to 
count its depth. This is needed to express some of the complexity invariants that 
the helper functions satisfy. 

data TreeZip' :: ℤ → ℤ → ℕ →        -- root indices
                 ℤ → ℤ → ℤ → ℕ →    -- hole indices
                 ℕ →                -- depth
                 * where
  Root' :: ∀(i j :: ℤ) (n :: ℕ) . TreeZip' i j n i j Black n 0
  ZRL'  :: ∀(i j i' j' :: ℤ) (n n' d :: ℕ) . Π (x :: ℤ) .
             TreeZip' i' j' n' i j Red n d → RBTree x j Black n →
               TreeZip' i' j' n' i x Black n (d + 1)
  ZRR'  :: ∀(i' j' i j :: ℤ) (n' n d :: ℕ) . Π (x :: ℤ) .
             RBTree i x Black n → TreeZip' i' j' n' i j Red n d →
               TreeZip' i' j' n' x j Black n (d + 1)
  ZBL'  :: ∀(i' j' i j c :: ℤ) (n' n d :: ℕ) . Π (x :: ℤ) .
             TreeZip' i' j' n' i j Black (n + 1) d → RBTree x j Black n →
               TreeZip' i' j' n' i x c n (d + 1)
  ZBR'  :: ∀(i' j' i j c :: ℤ) (n' n d :: ℕ) . Π (x :: ℤ) .
             RBTree i x c n → TreeZip' i' j' n' i j Black (n + 1) d →
               TreeZip' i' j' n' x j Black n (d + 1)

The SearchResult' type packs up the extra index, but is otherwise unchanged. 

data SearchResult' :: ℤ → ℤ → ℤ → ℕ → * where
  Found'   :: ∀(x i' j' i j c :: ℤ) (n' n d :: ℕ) .
                TreeZip' i' j' n' i j c n d → RBTree i j c n →
                  SearchResult' x i' j' n'
  Missing' :: ∀(x i' j' i j :: ℤ) (n' d :: ℕ) . (i < x, x < j) ⇒
                TreeZip' i' j' n' i j Black 0 d →
                  SearchResult' x i' j' n'

The searchCost function returns a result in the Cost monad, showing that it takes 
2n' + 2 steps where n' is the black height of the tree. Some work is needed to 
choose the appropriate invariant to be maintained in the helper function. In 
this case, the invariant depends on the colour of the tree, so a separate helper 
function is needed when the subtree is black. The lines of the helper functions 
are annotated with calls to tick. Pure values are inserted into the Cost monad 
with returnW. The wait function is used to weaken a bound where the result is 
computed more quickly than the type requires. 
searchCost :: ∀(i' j' :: ℤ) (n' :: ℕ) . Π (x :: ℤ) . (i' < x, x < j') ⇒
                RBTree i' j' Black n' →
                  Cost (2 * n' + 2) (SearchResult' x i' j' n')
searchCost {x} t = tick (helpB Root' t)
  where
    help :: ∀(i j c :: ℤ) (n d :: ℕ) .
              ((1 + (2 * n) + d) ≤ (2 * n'), i < x, x < j) ⇒
                TreeZip' i' j' n' i j c n d → RBTree i j c n →
                  Cost (2 + 2 * n) (SearchResult' x i' j' n')
    help z E                      = tick (returnW (Missing' z))
    help z (TR {y} l r) | {x < y} = tick (helpB (ZRL' {y} z r) l)
    help z (TR {y} l r) | {x ~ y} = tick (returnW (Found' z (TR {y} l r)))
    help z (TR {y} l r) | {x > y} = tick (helpB (ZRR' {y} l z) r)
    help z (TB {y} l r) | {x < y} = tick (wait (help (ZBL' {y} z r) l))
    help z (TB {y} l r) | {x ~ y} = tick (returnW (Found' z (TB {y} l r)))
    help z (TB {y} l r) | {x > y} = tick (wait (help (ZBR' {y} l z) r))

    helpB :: ∀(i j :: ℤ) (n d :: ℕ) .
               (((2 * n) + d) ≤ (2 * n'), i < x, x < j) ⇒
                 TreeZip' i' j' n' i j Black n d → RBTree i j Black n →
                   Cost (2 * n + 1) (SearchResult' x i' j' n')
    helpB z E                      = tick (returnW (Missing' z))
    helpB z (TB {y} l r) | {x < y} = tick (help (ZBL' {y} z r) l)
    helpB z (TB {y} l r) | {x ~ y} = tick (returnW (Found' z (TB {y} l r)))
    helpB z (TB {y} l r) | {x > y} = tick (help (ZBR' {y} l z) r)

The membership test can be implemented as before, inserting the necessary 
monadic plumbing and calls to tick. Thus it returns a result in 2n + 4 steps. 

memberCost :: ∀(i j :: ℤ) (n :: ℕ) . Π (x :: ℤ) . (i < x, x < j) ⇒
                RBTree i j Black n → Cost (2 * n + 4) Bool
memberCost {x} t = tick (bind (searchCost {x} t) f)
  where
    f :: SearchResult' x i j n → Cost 1 Bool
    f (Missing' _) = tick (return False)
    f (Found' _ _) = tick (return True)
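
The bound in the type of memberCost can be checked by hand from the types of its parts. The following accounting is only a restatement of the signatures of searchCost, bind and tick given in this section, writing cost(e) for the index of the Cost type of e (a notation assumed here for convenience):

```latex
\begin{align*}
\mathrm{cost}(\mathit{searchCost}\ \{x\}\ t) &= 2n + 2 \\
\mathrm{cost}(f\ r) &= 0 + 1 = 1 \\
\mathrm{cost}(\mathit{bind}\ (\mathit{searchCost}\ \{x\}\ t)\ f) &= (2n + 2) + 1 = 2n + 3 \\
\mathrm{cost}(\mathit{memberCost}\ \{x\}\ t) &= (2n + 3) + 1 = 2n + 4
\end{align*}
```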

The force function can be used to escape the Cost monad and acquire a value. 
member' :: ∀(i j :: ℤ) . Π (x :: ℤ) . (i < x, x < j) ⇒ RBT i j → Bool
member' {x} (RBT t) = force (memberCost {x} t)

This approach can be used to show that both insertion and deletion are linear 
in the black height of the tree. The types given to the main functions are: 

insert :: ∀(i j :: ℤ) (n :: ℕ) . Π (x :: ℤ) . (i < x, x < j) ⇒
            Tree i j Black n → Cost (4 * n + 6) (RBT i j)

delete :: ∀(i j :: ℤ) (n :: ℕ) . Π (x :: ℤ) . (i < x, x < j) ⇒
            Tree i j Black n → Cost (5 * n + 6) (RBT i j)

As in the member example, the main difficulty is in choosing appropriate in- 
variants; the annotation is routine. Interactive program construction makes this 
easier, as it enables exploratory programming. 

8.5 Units of measure 

This section demonstrates a use for type-level integers, rather than natural num- 
bers: a library for representing units of measure. Unlike the approach taken in 
Chapter 3, which requires a language extension but can support an arbitrary set 
of base units, this library can be implemented using existing features of inch, but 
the base units must be fixed ahead of time. Moreover, type errors will reveal 
the underlying representation of units, rather than being expressed in an easy- 
to-understand format. The dimensional package of Buckwalter (n.d.) is a much 
more comprehensive implementation of units of measure using this approach, but 
with type-level integers implemented via existing features of GHC Haskell. 

The Unit constructor has arguments for the powers of three base units (metres, 
seconds and kilograms). A real units of measure implementation would supply 
more base units, but the number would still be fixed. The Quantity newtype wraps 
a numeric value, and has a phantom type parameter that will be instantiated with 
some application of Unit. This separation makes it easy to write functions that 
are completely polymorphic in the units. 

data Unit :: ℤ → ℤ → ℤ → *
newtype Quantity u a = Q {value :: a}

The Q constructor should not be exported from the module in which it is 
defined, in order to prevent clients of the library from changing units arbitrarily. 
Instead, all access must be through the functions defined below. 
In the full inch system, with support for promoted datatypes, Unit would be a 
data constructor rather than a type constructor. Type synonyms can be defined 
for common units. If type families or type-level functions were available, one 
could define operations to combine units (such as multiplication of units). 

type Dimensionless   = Unit 0 0 0
type Metres          = Unit 1 0 0
type Seconds         = Unit 0 1 0
type Kilograms       = Unit 0 0 1
type MetresPerSecond = Unit 1 (-1) 0
type Newtons         = Unit 1 (-2) 1

Users of the library will have access to smart constructors, which wrap the un- 
derlying newtype constructor Q, but specify the units. 



dimensionless :: a → Quantity Dimensionless a
metres        :: a → Quantity Metres a
seconds       :: a → Quantity Seconds a
kilograms     :: a → Quantity Kilograms a
(dimensionless, metres, seconds, kilograms) = (Q, Q, Q, Q)

The usual arithmetic operations can be defined on quantities, with the types 
ensuring that the units are respected. However, Quantity u a cannot be made an 
instance of the Num typeclass, because multiplication does not preserve units. 

plus :: Num a ⇒ Quantity u a → Quantity u a → Quantity u a
plus (Q x) (Q y) = Q (x + y)

minus :: Num a ⇒ Quantity u a → Quantity u a → Quantity u a
minus (Q x) (Q y) = Q (x - y)

The type signatures of the following operations would be significantly simpler if 
type-level functions could be defined. 

times :: ∀(m s g m' s' g' :: ℤ) a . Num a ⇒
           Quantity (Unit m s g) a → Quantity (Unit m' s' g') a →
             Quantity (Unit (m + m') (s + s') (g + g')) a
times (Q x) (Q y) = Q (x * y)

inv :: ∀(m s g :: ℤ) a . Fractional a ⇒
         Quantity (Unit m s g) a → Quantity (Unit (-m) (-s) (-g)) a
inv (Q x) = Q (recip x)

over :: ∀(m s g m' s' g' :: ℤ) a . (Num a, Fractional a) ⇒
          Quantity (Unit m s g) a → Quantity (Unit m' s' g') a →
            Quantity (Unit (m - m') (s - s') (g - g')) a
over x y = times x (inv y)

pow :: ∀(m s g :: ℤ) a . Fractional a ⇒
         Π (k :: ℕ) . Quantity (Unit m s g) a →
           Quantity (Unit (k * m) (k * s) (k * g)) a
pow {k} (Q x) = Q (x ^^ k)

Scaling a quantity by a dimensionless constant is useful: 

scale :: Num a ⇒ a → Quantity u a → Quantity u a
scale x (Q y) = Q (x * y)

minutes = scale 60 ∘ seconds
hours   = scale 60 ∘ minutes

More generally, unit prefixes can be written as transformers of the constructors 
that scale by an appropriate constant: 

type Prefix u a = (a → Quantity u a) → a → Quantity u a

prefix :: Num a ⇒ a → Prefix u a
prefix n f = scale n ∘ f

kilo  = prefix 1000
centi = prefix (recip 100)
milli = prefix (recip 1000)

This allows prefixed units to be expressed neatly: 

km = kilo metres 
cm = centi metres 
mm = milli metres 

Finally, a special case of flipped application allows expressions such as units 3 cm 
and units 15 km `over` units 3 hours. 

units :: a → (a → Quantity u b) → Quantity u b
units x f = f x

As an example of using the library, here is a variant of the function from the 
introduction to Chapter 3 that calculates the distance travelled over time by an 
object with a fixed initial velocity and constant acceleration. The top-level type 
annotation is entirely optional. 

distanceTravelled :: (Num a, Fractional a) ⇒
                       Quantity Seconds a → Quantity Metres a
distanceTravelled t = plus (times vel t) (times accel (pow {2} t))
  where
    vel   = over (units 20 metres) (units 1 seconds)
    accel = over (units 36 metres) (pow {2} (units 1 seconds))
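
The exponent arithmetic that the types perform can be made concrete with a run-time analogue in ordinary Haskell. This is a sketch under assumptions, not the library above: here Unit is a value carrying the three exponents, Quantity stores it alongside a Double, and a unit mismatch in plus surfaces as Nothing at run time instead of as a type error at compile time.

```haskell
-- Exponents of metres, seconds and kilograms, tracked at run time.
-- Assumption: this replaces the phantom type index of the inch library.
data Unit = Unit Int Int Int
  deriving (Eq, Show)

data Quantity = Q Unit Double
  deriving (Eq, Show)

metres, seconds, kilograms :: Double -> Quantity
metres    = Q (Unit 1 0 0)
seconds   = Q (Unit 0 1 0)
kilograms = Q (Unit 0 0 1)

-- Addition is defined only when the units agree.
plus :: Quantity -> Quantity -> Maybe Quantity
plus (Q u x) (Q v y)
  | u == v    = Just (Q u (x + y))
  | otherwise = Nothing

-- Multiplication adds exponents; inversion negates them.
times :: Quantity -> Quantity -> Quantity
times (Q (Unit m s g) x) (Q (Unit m' s' g') y) =
  Q (Unit (m + m') (s + s') (g + g')) (x * y)

inv :: Quantity -> Quantity
inv (Q (Unit m s g) x) = Q (Unit (negate m) (negate s) (negate g)) (recip x)

over :: Quantity -> Quantity -> Quantity
over x y = times x (inv y)

-- Raising to the power k multiplies every exponent by k.
pow :: Int -> Quantity -> Quantity
pow k (Q (Unit m s g) x) = Q (Unit (k * m) (k * s) (k * g)) (x ^^ k)

distanceTravelled :: Quantity -> Maybe Quantity
distanceTravelled t = plus (times vel t) (times accel (pow 2 t))
  where
    vel   = over (metres 20) (seconds 1)
    accel = over (metres 36) (pow 2 (seconds 1))
```

Both summands in distanceTravelled end up with exponents 1 0 0, which is exactly the equation the inch constraint solver discharges statically: (1, -1, 0) + (0, 1, 0) for the velocity term and (1, -2, 0) + (0, 2, 0) for the acceleration term.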

Kennedy (2010, §3.10) gave an example of a function whose type cannot be 
inferred by the units-of-measure type system in F#, because of difficulties with 
generalisation, as explained in Subsection 3.0.1. 

trouble = \ x → let d = over x
                in (d mass, d time)
  where
    mass = units 5 kilograms
    time = units 3 seconds

The inch system has no trouble inferring the most general type for this function 

trouble :: ∀ a (m :: ℤ) (s :: ℤ) (g :: ℤ) .
             (Num a, Fractional a) ⇒
               Quantity (Unit m s g) a →
                 (Quantity (Unit m s (g - 1)) a, Quantity (Unit m (s - 1) g) a)

although the fixed basis of units means it is more limited than Kennedy's solution 
or the algorithm in Chapter 3. 
Chapter 9 
Conclusion 



The inch language described in this thesis is an experiment in re-imagining GHC 
Haskell. It shows how insights from work on dependent type theory can con- 
tribute to the development of Haskell's type system, intermediate language and 
elaboration process. It is not intended as a finished product or a rival system; 
rather, I have investigated some of the ways in which Haskell might develop. 

Haskell as implemented in GHC is a moving target, with new language ex- 
tensions being introduced frequently. The recent enrichment of the kind system 
with polymorphism and datatype promotion paves the way for the identification 
of kinds with types, a key aspect of the design of inch, and work to implement 
this is ongoing. Weirich et al. (2013) and the evidence language of Chapter 6 
show that this gives a reasonable core language; the discussion of elaboration in 
Chapter 7 gives some idea of how type inference will continue to work. 

The addition of Π-types to the language offers the possibility of significantly 
simplifying Haskell programming with dependent types. In particular, it avoids 
the need for singleton constructions that result in many incompatible names for 
essentially the same object. If Haskell's type system is to become more depen- 
dent, the key requirement is for the operational semantics of the term and type 
levels to be aligned, breaking the strict distinction between functions and type 
families. The shared functions of this thesis offer a possible way forward. While 
not requiring the identification of kinds and types, Π-types are much more useful 
if the identification is made, since then indexed datatypes can be quantified over. 

Another key aspect of Chapter 7 is the increased flexibility it offers for which 
arguments are expected to be inferred by the machine. Milner's compromise, 
particularly the insistence that type-level expressions be invisible in terms, is 
no longer tenable in a world of advanced type-level features. By providing the 
machine with a small amount of help, we can gain significant expressive power. 



The case for permitting explicit type application and quantification grows ever 
stronger. Π-types benefit from case-by-case decisions on whether they should be 
explicit or implicit, and extending the same mechanism to ∀-quantifiers seems 
natural. In any case, wherever an argument is supposed to be inferred by the 
machine, it should be possible for the user to supply it. 

Type inference and unification with nontrivial equational theories has been a 
key theme of the first part of this thesis, including the theory of abelian groups for 
units of measure in Chapter 3 and the theory of βη-conversion in Chapter 4. A 
desirable feature for a system of type-level numbers is automatically solving the 
constraints that arise, and the abelian group structure of the integers provides 
a starting point for this, though the presence of local hypotheses complicates 
matters and more research is needed. As I have outlined, the careful management 
of variable scope (using dependency-ordered contexts) can help make it clear how 
to solve constraints in a most general fashion. 

Elaboration of full-spectrum dependently-typed languages is another topic in 
need of further work, as practical implementations are not always theoretically 
well understood. I hope that the higher-order unification algorithm in Chapter 4 
may provide a useful base for describing elaboration more precisely. 



Appendix A 



Reference implementation of 
Hindley-Milner type inference 

In this appendix and the two that follow I will present reference implementa- 
tions for the unification and type inference algorithms described in Part I of this 
thesis. The implementations are presented in literate Haskell, and I will take 
slight liberties with the Haskell syntax. In particular, I will use italicised capital 
letters (e.g. A) for Haskell variables, while sans-serif capital letters (e.g. A) will 
continue to stand for data constructors. This allows me to retain more of the 
syntactic conventions of the earlier chapters, such as using Θ for a metacontext 
and A for an object language type. I will omit boilerplate code such as module 
import lists and straightforward typeclass instances, and routine support code 
for pretty-printing and testing. 

The code has been tested using version 7.6.3 of the Glasgow Haskell Compiler, 
with version 2013.2.0.0 of the Haskell Platform and version 0.6 of the Strathclyde 
Haskell Enhancement (McBride, 2010b). It is available online 1 and with the 
electronic version of this thesis. In addition to the standard libraries, the Binders 
Unbound library of Weirich et al. (2011b) is used to represent syntax with names 
and bindings, deriving α-equivalence and substitution functions automatically. 

In this appendix, I implement syntactic unification and Hindley-Milner type 
inference, as described in Chapter 2. Section A.1 gives datatypes representing 
types, terms and contexts in the object language; Section A.2 gives the 
implementation of unification, and this is used in Section A.3 to implement type 
inference. Finally, Section A.4 contains an implementation of elaboration from 
Hindley-Milner terms into System F, based on a zipper. 

1 https://github.com/adamgundry/type-inference/ 



A.1 Representation of types and terms 



This section implements types, contexts and terms, as in Section 2.1 (page 11). 

The datatype Type represents types of the object language, which may contain 
metavariables M and variables V as well as functions and a base type. The Name 
constructor is provided by the Binders Unbound library. 

data Type = M (Name Type) | V (Name Type) | Type -> Type 

The fmv function computes the free metavariables of a type. 

fmv :: Type → Set (Name Type)
fmv (M α)    = {α}
fmv (V a)    = ∅
fmv (τ -> υ) = fmv τ ∪ fmv υ

The datatype Scheme represents type schemes. Binding variables uses a locally 
nameless representation where bound variables have de Bruijn indices and free 
variables (those bound in the context) have names (McBride and McKinna, 2004). 

data Scheme = T Type | All (Bind (Name Type) Scheme) 

Bwd is the type of backwards lists with • for the empty list and :< for snoc. 
Lists are traversable functors, and monoids under concatenation (⊕), in the usual 
way. Datatype declarations are cheap, so rather than reusing the forwards list 
type [], I prefer to make the code closer to the specification. 

data Bwd a = • | Bwd a :< a

Contexts are backwards lists of entries, which are either metavariables E 
(possibly carrying a definition), term variables Z or generalisation markers ⨟. 
A context suffix contains only metavariable entries, and can be appended to a 
context with the 'fish' operator (<><). 

type Context = Bwd Entry 

type Suffix = [(Name Type, Decl Type)] 

data Decl v = HOLE | DEFN v 

data Entry = E (Name Type) (Decl Type) | Z (Name Tm) Scheme | , 

(<x) :: Context — > Suffix — > Context 

0 <X ((a, d) : es) = (<9 :< E a d) <X es 
O <X [] = 0 
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The Contextual monad represents computations that can mutate the context, 
generate fresh names and throw exceptions. It thus encapsulates the effects 
needed to implement unification and type inference. I will use the throwError 
operation in the monad to abort due to 'expected' errors, such as impossible uni- 
fication problems, and the Haskell built-in error for violations of invariants that 
would indicate bugs in the implementation itself. 

newtype Contextual a = Contextual 

(StateT Context (FreshMT (ErrorT String Identity)) a) 

The popL function removes and returns an entry from the metacontext. 

popL :: Contextual Entry
popL = do  Θ :< e <- get
           put Θ
           return e

The freshMeta function generates a fresh metavariable name and appends a
HOLE to the context.

freshMeta :: String -> Contextual (Name Type)
freshMeta a = do  α <- fresh (s2n a)
                  modify (:< E α HOLE)
                  return α

The datatype Tm represents terms in the object language. As with type 
schemes, it uses a locally nameless representation. 

data Tm = X (Name Tm) | App Tm Tm 

| Lam (Bind (Name Tm) Tm) | Let Tm (Bind (Name Tm) Tm) 

The Contextual monad supports the find function, which looks up a term 
variable in the context and returns its scheme. 

find :: Name Tm -> Contextual Scheme
find x = get >>= help
  where
    help :: Context -> Contextual Scheme
    help •                       = throwError $ "Out of scope: " ++ show x
    help (Θ :< Z y σ) | x == y   = return σ
    help (Θ :< _)                = help Θ

The inScope operator runs a Contextual computation with an additional term
variable in scope, then removes the variable afterwards.

inScope :: Name Tm -> Scheme -> Contextual a -> Contextual a
inScope x σ m = do  modify (:< Z x σ)
                    a <- m
                    modify dropVar
                    return a
  where
    dropVar •                      = error "Invariant violation"
    dropVar (Θ :< Z y _) | x == y  = Θ
    dropVar (Θ :< e)               = dropVar Θ :< e

A.2 Unification

Having set up the necessary data structures, I will now implement the unification 
algorithm of Section 2.2 (page 19). 

The onTop operator delivers the typical access pattern for contexts, locally
bringing the top variable declaration into focus and working over the remainder.
The local operation f, passed as an argument, may restore the previous entry, or
it may return a context extension (containing at least as much information as the
entry that has been removed) with which to replace it.

data Extension = Restore | Replace Suffix

onTop ::  (Name Type -> Decl Type -> Contextual Extension)
            -> Contextual ()
onTop f = popL >>= \ e -> case e of
    E α d  -> f α d >>= \ m -> case m of
                  Replace Ξ  -> modify (<>< Ξ)
                  Restore    -> modify (:< e)
    _      -> onTop f >> modify (:< e)

restore :: Contextual Extension
restore = return Restore

replace :: Suffix -> Contextual Extension
replace = return . Replace

The unify function actually implements unification. This proceeds structurally
over types. If it reaches a pair of metavariables, it examines the context, using
onTop to pick out a declaration to consider. Depending on the metavariables, it
then either succeeds, restoring the old entry or replacing it with a new one, or
continues with an updated constraint.

unify :: Type -> Type -> Contextual ()
unify (τ0 :-> τ1)  (υ0 :-> υ1)  = unify τ0 υ0 >> unify τ1 υ1
unify (M α)        (M β)        = onTop $ \ γ d -> case
          (γ == α,  γ == β,  d      ) of
          (True,    True,    _      )  -> restore
          (True,    False,   HOLE   )  -> replace [(α, DEFN (M β))]
          (False,   True,    HOLE   )  -> replace [(β, DEFN (M α))]
          (True,    False,   DEFN τ )  -> unify (M β) τ      >> restore
          (False,   True,    DEFN τ )  -> unify (M α) τ      >> restore
          (False,   False,   _      )  -> unify (M α) (M β)  >> restore
unify (M α)        τ            = solve α [] τ
unify τ            (M α)        = solve α [] τ
unify _            _            = throwError "Rigid-rigid mismatch"

The solve function is called to unify a metavariable with a rigid type (one that 
is not a metavariable). It works similarly to the way unify works on pairs of 
metavariables, but must also accumulate a list of the type's dependencies and 
push them left through the context. It performs the occurs check and throws an 
exception if an illegal occurrence (leading to an infinite type) is detected. 

solve :: Name Type -> Suffix -> Type -> Contextual ()
solve α Ξ τ = onTop $
    \ γ d -> case
      (γ == α,  γ ∈ fmv τ,  d      ) of
      (_,       _,          DEFN υ )  ->  modify (<>< Ξ)
                                      >>  unify (subst γ υ (M α)) (subst γ υ τ)
                                      >>  restore
      (True,    True,       HOLE   )  ->  throwError "Occurrence detected!"
      (True,    False,      HOLE   )  ->  replace (Ξ ⊕ [(α, DEFN τ)])
      (False,   True,       HOLE   )  ->  solve α ((γ, HOLE) : Ξ) τ
                                      >>  replace []
      (False,   False,      HOLE   )  ->  solve α Ξ τ
                                      >>  restore
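To make the control flow of unify and solve easier to trace by hand, here is a minimal pure sketch of the same case analysis. It is not the thesis code: it assumes String names, a context holding only metavariable entries stored most recent first, and explicit context passing in Either String in place of the Contextual monad. The names Step, onTop, restore and replace mirror the listing, but their types here are hypothetical.

```haskell
import qualified Data.Set as Set

type Name = String
data Type = M Name | V Name | Type :-> Type deriving (Eq, Show)
infixr 5 :->
data Decl = Hole | Defn Type deriving (Eq, Show)

-- Context: most recent entry first. Suffix: oldest entry first.
type Context = [(Name, Decl)]
type Suffix  = [(Name, Decl)]

fmv :: Type -> Set.Set Name
fmv (M a)     = Set.singleton a
fmv (V _)     = Set.empty
fmv (s :-> t) = Set.union (fmv s) (fmv t)

-- Substitute a type for a metavariable.
subst :: Name -> Type -> Type -> Type
subst a u (M b) | a == b = u
subst a u (s :-> t)      = subst a u s :-> subst a u t
subst _ _ t              = t

-- Push a suffix onto the top of a context.
(<><) :: Context -> Suffix -> Context
ctx <>< sfx = reverse sfx ++ ctx

-- A step inspects the top entry and returns the updated remainder,
-- plus Nothing (restore the entry) or Just a replacement suffix.
type Step = Name -> Decl -> Context -> Either String (Context, Maybe Suffix)

onTop :: Step -> Context -> Either String Context
onTop _ []             = Left "Ran out of context"
onTop f ((c, d) : ctx) = do
  (ctx', ext) <- f c d ctx
  return (maybe ((c, d) : ctx') (ctx' <><) ext)

restore :: Context -> Either String (Context, Maybe Suffix)
restore ctx = Right (ctx, Nothing)

replace :: Suffix -> Context -> Either String (Context, Maybe Suffix)
replace sfx ctx = Right (ctx, Just sfx)

unify :: Type -> Type -> Context -> Either String Context
unify (s0 :-> s1) (t0 :-> t1) ctx = unify s0 t0 ctx >>= unify s1 t1
unify (M a) (M b) ctx = onTop step ctx
  where
    step c d rest = case (c == a, c == b, d) of
      (True,  True,  _)      -> restore rest
      (True,  False, Hole)   -> replace [(a, Defn (M b))] rest
      (False, True,  Hole)   -> replace [(b, Defn (M a))] rest
      (True,  False, Defn t) -> unify (M b) t rest >>= restore
      (False, True,  Defn t) -> unify (M a) t rest >>= restore
      (False, False, _)      -> unify (M a) (M b) rest >>= restore
unify (M a) t ctx = solve a [] t ctx
unify t (M a) ctx = solve a [] t ctx
unify _ _ _       = Left "Rigid-rigid mismatch"

-- Solve a := t, carrying the dependencies of t leftwards in the suffix.
solve :: Name -> Suffix -> Type -> Context -> Either String Context
solve a sfx t = onTop step
  where
    step c d rest = case (c == a, c `Set.member` fmv t, d) of
      (_,     _,     Defn u) ->
        unify (subst c u (M a)) (subst c u t) (rest <>< sfx) >>= restore
      (True,  True,  Hole)   -> Left "Occurrence detected!"
      (True,  False, Hole)   -> replace (sfx ++ [(a, Defn t)]) rest
      (False, True,  Hole)   -> solve a ((c, Hole) : sfx) t rest >>= replace []
      (False, False, Hole)   -> solve a sfx t rest >>= restore
```

For instance, starting from a context in which a was declared before b, solving a against the type b :-> b moves the declaration of b to the left of a, so that the definition of a is well scoped. Unifying a with a :-> a fails the occurs check.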
A.3 Type inference



Building on the implementation of unification in the previous section, I now 
implement the type inference algorithm described in Section 2.3 (page 23). 

The metaBind and metaUnbind functions extend the bind and unbind functions 
provided by the Binders Unbound library, so that binding a metavariable converts 
it into a variable, and vice versa. 

metaBind ::  (Alpha t, Subst Type t) =>
                 Name Type -> t -> Bind (Name Type) t
metaBind α = bind α . subst α (V α)

metaUnbind ::  (Alpha t, Subst Type t, Fresh m) =>
                   Bind (Name Type) t -> m (Name Type, t)
metaUnbind b = do  (α, t) <- unbind b
                   return (α, subst α (M α) t)

Specialisation of type schemes is implemented by the specialise function, which
unpacks a scheme with fresh metavariables for the bound variables.

specialise :: Scheme -> Contextual Type
specialise (T τ)    = return τ
specialise (All b)  = do  (β, σ) <- metaUnbind b
                          modify (:< E β HOLE)
                          specialise σ

Generalisation turns a type into a scheme by 'skimming' entries off the top of
the metacontext. The generaliseOver control operator runs a Contextual
computation in a new locality (extending the context by ⨟), then generalises the
resulting type until it finds the ⨟ again. This depends on the (⇑) function,
which generalises a suffix of metavariables over a type to produce a scheme.

generaliseOver :: Contextual Type -> Contextual Scheme
generaliseOver x = do  modify (:< ⨟)
                       τ <- x
                       Ξ <- skimContext []
                       return (Ξ ⇑ τ)
  where
    skimContext :: Suffix -> Contextual Suffix
    skimContext Ξ = popL >>= \ e -> case e of
        E α d  -> skimContext ((α, d) : Ξ)
        ⨟      -> return Ξ

(⇑) :: Suffix -> Type -> Scheme
[]                 ⇑ τ = T τ
((α, HOLE)   : Ξ)  ⇑ τ = All (metaBind α (Ξ ⇑ τ))
((α, DEFN υ) : Ξ)  ⇑ τ = subst α υ (Ξ ⇑ τ)

Finally, the infer function implements the type inference algorithm. It
proceeds structurally through the term, following the rules in Figure 2.9
(page 26) and using the monadic operations defined earlier.

infer :: Tm -> Contextual Type
infer (X x)      = find x >>= specialise
infer (Lam b)    = do  (x, t) <- unbind b
                       α <- M <$> freshMeta "alpha"
                       υ <- inScope x (T α) $ infer t
                       return (α :-> υ)
infer (App f s)  = do  χ <- infer f
                       υ <- infer s
                       β <- M <$> freshMeta "beta"
                       unify χ (υ :-> β)
                       return β
infer (Let s b)  = do  σ <- generaliseOver (infer s)
                       (x, t) <- unbind b
                       inScope x σ $ infer t
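The generalisation of a suffix over a type, used by generaliseOver above, can be illustrated in isolation. The sketch below is an assumption-laden simplification: it uses String names and a named All binder instead of Unbound's locally nameless Bind, and it relies on all metavariable names being distinct so that no capture can occur. The function names generalise, substT and substS are hypothetical.

```haskell
type Name = String

data Type   = M Name | V Name | Type :-> Type  deriving (Eq, Show)
infixr 5 :->
data Scheme = T Type | All Name Scheme         deriving (Eq, Show)
data Decl   = Hole | Defn Type                 deriving (Eq, Show)

-- Replace metavariable a by the type u.
substT :: Name -> Type -> Type -> Type
substT a u (M b) | a == b = u
substT a u (s :-> t)      = substT a u s :-> substT a u t
substT _ _ t              = t

-- Lift substitution to schemes (names are assumed distinct, so no capture).
substS :: Name -> Type -> Scheme -> Scheme
substS a u (T t)     = T (substT a u t)
substS a u (All b s) = All b (substS a u s)

-- Generalise a suffix (oldest entry first) over a type:
-- holes become quantifiers, definitions are substituted away.
generalise :: [(Name, Decl)] -> Type -> Scheme
generalise []                  t = T t
generalise ((a, Hole)   : sfx) t = All a (substS a (V a) (generalise sfx t))
generalise ((a, Defn u) : sfx) t = substS a u (generalise sfx t)
```

For example, generalising the suffix a := HOLE, b := DEFN a over the type b :-> b quantifies over a and eliminates b, giving a scheme whose body is a :-> a.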
A.4 Elaboration, zipper style



In this section, I implement the zipper-based elaboration algorithm described in
Section 2.4 (page 27). This transforms source language terms Tm (defined in
Section A.1) into System F terms FTm, represented thus:

data FTm = VarF (Name FTm) | AppTm FTm FTm | AppTy FTm Type 
| LamTm Scheme (Bind (Name FTm) FTm) 
| LamTy (Bind (Name Type) FTm) 

As described in the text, context entries now consist of metavariables and layers: 

data TermLayer = AppLeft () Tm 

| AppRight (FTm, Type) () 

| LamBody (Name Tm,Type) () 

| LetBinding () (Bind (Name Tm) Tm) 

| LetBody (Name Tm) (FTm, Scheme) () 

data Entry = E (Name Type) (Decl Type) | L TermLayer 

Most functions from the previous sections, including the unification algorithm, 
remain unchanged. The find function, which looks up a term variable in the 
context and returns its type scheme, is easily adapted to the new structure: 

find :: Name Tm -> Contextual Scheme
find x = get >>= help
  where
    help :: Context -> Contextual Scheme
    help •                                       = throwError $ "Out of scope: " ++ show x
    help (Θ :< L (LamBody (y, τ) ()))   | x == y = return (T τ)
    help (Θ :< L (LetBody y (_, σ) ())) | x == y = return σ
    help (Θ :< _)                                = help Θ

The specialise function takes an elaborated term and its scheme, and applies the 
term to fresh metavariables to produce a witness of the specialised type. 

specialise :: FTm -> Scheme -> Contextual (FTm, Type)
specialise t (T τ)    = return (t, τ)
specialise t (All b)  = do  (β, σ) <- metaUnbind b
                            modify (:< E β HOLE)
                            specialise (t `AppTy` M β) σ






Now elaboration can be implemented as a tail-recursive function elab. To
elaborate a variable, it looks up the type scheme and instantiates it with fresh
metavariables, then calls the next function to navigate the zipper structure and
find the next elaboration problem. For λ-abstractions, applications and
let-bindings, it extends the zipper and elaborates the appropriate subterm.

elab :: Tm -> Contextual (FTm, Type)
elab (X x)        = do  σ <- find x
                        next [] =<< specialise (VarF x) σ
elab (Lam b)      = do  (x, t) <- unbind b
                        α <- freshMeta "alpha"
                        modify (:< L (LamBody (x, M α) ())) >> elab t
elab (f `App` a)  = modify (:< L (AppLeft () a)) >> elab f
elab (Let s b)    = modify (:< L (LetBinding () b)) >> elab s

The next function is called with the term at the current location and its type.
It navigates through the zipper structure to find the next elaboration problem,
updating the current term and type as it does so. The accumulator Ξ collects
metavariables that are encountered along the way. These are reinserted into the
context once the new problem is found, or if a LetBinding layer is encountered, Ξ
contains exactly the metavariables to generalise over.

next :: Suffix -> (FTm, Type) -> Contextual (FTm, Type)
next Ξ (t, τ) = optional popL >>= \ e -> case e of
    Just (L (AppLeft () a))         -> do  modify (<>< Ξ)
                                           modify (:< L (AppRight (t, τ) ()))
                                           elab a
    Just (L (AppRight (f, σ) ()))   -> do  modify (<>< Ξ)
                                           β <- M <$> freshMeta "beta"
                                           unify σ (τ :-> β)
                                           next [] (f `AppTm` t, β)
    Just (L (LamBody (x, υ) ()))    -> next Ξ (λ x : υ. t, υ :-> τ)
    Just (L (LetBinding () b))      -> do  (x, w) <- unbind b
                                           let (t', σ) = (Λ Ξ. t, Ξ ⇑ τ)
                                           modify (:< L (LetBody x (t', σ) ()))
                                           elab w
    Just (L (LetBody x (s, σ) ()))  -> next Ξ ((λ x : σ. t) `AppTm` s, τ)
    Just (E α d)                    -> next ((α, d) : Ξ) (t, τ)
    Nothing                         -> modify (<>< Ξ) >> return (t, τ)

Appendix B



Reference implementation of 
units of measure 

This appendix extends the implementation of unification in Appendix A to
support the units of measure of Chapter 3. Section B.1 introduces the data types
representing units of measure in normal form, using the signed-multiset library
of Holdermans (2013). Section B.2 extends the representation of types and
contexts from Section A.1 to support the syntax of units. The implementation of
unification for units of measure is given in Section B.3, and this is used to
implement type unification in Section B.4. There is no change to the
implementation of type inference from Section A.3, other than using the new
unification algorithm.

B.1 Representation of units of measure

I begin by introducing the semantic representation of units of measure, along
with operations on them, as described in Section 3.1 (page 35). A unit of measure 
is represented as a Unit value with signed multisets of metavariables and constants. 
For simplicity, the type of base units is fixed. 

data Unit = Unit (SignedMultiset (Name Type)) (SignedMultiset BaseUnit)
data BaseUnit = METRE | SEC | KG

The mkUnit function creates a unit from lists of powers of metavariables and
base units. As a special case, metaUnit creates a unit from a single metavariable.

mkUnit :: [(Name Type, Int)] -> [(BaseUnit, Int)] -> Unit
mkUnit vs bs = Unit (fromList vs) (fromList bs)

metaUnit :: Name Type -> Unit
metaUnit α = mkUnit [(α, 1)] []



Utility functions determine if a unit is the identity or constant, the number
of variables it contains, and the power of a metavariable in it.

isIdentity :: Unit -> Bool
isIdentity (Unit vs bs) = null vs && null bs

isConstant :: Unit -> Bool
isConstant (Unit vs bs) = null vs

numVariables :: Unit -> Int
numVariables (Unit vs _) = size vs

powerIn :: Name Type -> Unit -> Int
α `powerIn` Unit vs _ = multiplicity α vs

The dividesPowers function determines if an integer divides all the powers of
metavariables and base units.

dividesPowers :: Int -> Unit -> Bool
n `dividesPowers` (Unit vs bs) = dividesAll vs && dividesAll bs
  where
    dividesAll :: SignedMultiset a -> Bool
    dividesAll = all ((0 ==) . (`mod` n) . snd) . toList

The notMax function determines if the power of a variable is less than the
power of at least one other variable.

notMax :: (Name Type, Int) -> Unit -> Bool
notMax (α, n) (Unit vs _) = any bigger (toList vs)
  where bigger (β, m) = α /= β && abs n <= abs m

The (⊗), (⊘) and (⊛) operators respectively multiply and divide units, and
raise a unit to a constant power.

(⊗) :: Unit -> Unit -> Unit
Unit vs bs ⊗ Unit vs' bs' = Unit (additiveUnion vs vs') (additiveUnion bs bs')

(⊘) :: Unit -> Unit -> Unit
d ⊘ e = d ⊗ invert e

(⊛) :: Unit -> Int -> Unit
Unit vs bs ⊛ k = Unit (multiply k vs) (multiply k bs)

invert :: Unit -> Unit
invert (Unit vs bs) = Unit (shadow vs) (shadow bs)

The pivot function removes the given metavariable from the unit, inverts it
and takes the quotient of its powers by the power of the removed variable.

pivot :: Name Type -> Unit -> Unit
pivot α e = invert $ quotient $ e ⊘ (metaUnit α ⊛ n)
  where
    n = α `powerIn` e
    quotient (Unit vs bs) = mkUnit  (map (second (`quot` n)) (toList vs))
                                    (map (second (`quot` n)) (toList bs))

The substUnit function substitutes a unit for a metavariable in another unit. 

substUnit :: Name Type -> Unit -> Unit -> Unit
substUnit α d e = ((d ⊘ metaUnit α) ⊛ (α `powerIn` e)) ⊗ e
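The interplay of pivot and substUnit is easiest to see on a concrete unit. The following sketch is an illustration under stated assumptions, not the thesis code: it replaces the signed-multiset library with a Data.Map from names to integer powers, does not distinguish metavariables from base units, and uses the hypothetical names times, over and power for the three operators.

```haskell
import qualified Data.Map as Map

-- A unit in normal form: each name maps to a non-zero integer power.
newtype Unit = Unit (Map.Map String Int)
  deriving (Eq, Show)

mkUnit :: [(String, Int)] -> Unit
mkUnit = Unit . Map.filter (/= 0) . Map.fromListWith (+)

times :: Unit -> Unit -> Unit
times (Unit a) (Unit b) = Unit (Map.filter (/= 0) (Map.unionWith (+) a b))

invert :: Unit -> Unit
invert (Unit a) = Unit (Map.map negate a)

over :: Unit -> Unit -> Unit
over d e = d `times` invert e

power :: Unit -> Int -> Unit
power (Unit a) k = Unit (Map.filter (/= 0) (Map.map (* k) a))

powerIn :: String -> Unit -> Int
powerIn x (Unit a) = Map.findWithDefault 0 x a

dividesPowers :: Int -> Unit -> Bool
dividesPowers n (Unit a) = all ((== 0) . (`mod` n)) (Map.elems a)

-- Solve e = 1 for x, assuming x occurs in e and its power divides the others.
pivot :: String -> Unit -> Unit
pivot x e = invert (quotient (e `over` (mkUnit [(x, 1)] `power` n)))
  where
    n = x `powerIn` e
    quotient (Unit a) = Unit (Map.map (`quot` n) a)

-- Replace x by d everywhere in e.
substUnit :: String -> Unit -> Unit -> Unit
substUnit x d e = ((d `over` mkUnit [(x, 1)]) `power` (x `powerIn` e)) `times` e
```

Take e to be a^2 m^4 s^-2. The power of a is 2, which divides the other powers, so pivot yields m^-2 s as the solution for a, and substituting that solution back into e gives the identity unit.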

B.2 Representation of types

Now I extend the representation of types and contexts from Section A.1 to include
units of measure, as described in Subsection 3.0.2 (page 33). The datatype of
types retains metavariables, variables and functions, and gains syntax for units
(types of kind U): the identity, multiplication, constant exponentiation and base
units. The Float constructor is an example of a type parameterised by a unit.

data Kind = * | U

data Type  =  M (Name Type) | V (Name Type) | Type :-> Type
           |  Float Type | One | Type :* Type | Type :^ Int | Base BaseUnit

The set of free metavariables is computed in the obvious way.

fmv :: Type -> Set (Name Type)
fmv (M α)      = {α}
fmv (V α)      = ∅
fmv (τ :-> υ)  = fmv τ ∪ fmv υ
fmv (Float ν)  = fmv ν
fmv One        = ∅
fmv (ν :* ν')  = fmv ν ∪ fmv ν'
fmv (ν :^ _)   = fmv ν
fmv (Base _)   = ∅

It is easy to convert a semantic Unit to a syntactic expression Type, while the
other direction may fail if the type is not well-kinded.

unitToType :: Unit -> Type
unitToType (Unit xs ys)  =   foldr (\ α k τ -> (M α :^ k) :* τ) One xs
                         :*  foldr (\ u k τ -> (Base u :^ k) :* τ) One ys

typeToUnit :: Type -> Unit
typeToUnit (M α)      = metaUnit α
typeToUnit One        = mkUnit [] []
typeToUnit (ν :* ν')  = typeToUnit ν ⊗ typeToUnit ν'
typeToUnit (ν :^ k)   = typeToUnit ν ⊛ k
typeToUnit (Base b)   = mkUnit [] [(b, 1)]
typeToUnit _          = error "typeToUnit: kind error"

Type schemes are defined as in Appendix A, except that each ∀ quantifier
carries a kind.

data Scheme = T Type | All Kind (Bind (Name Type) Scheme)

Similarly, contexts are generalised to record the kinds of metavariables:

type Context  = Bwd Entry
type Suffix   = [(Name Type, Kind, Decl Type)]
data Entry    =  E (Name Type) Kind (Decl Type)
              |  Z (Name Tm) Scheme
              |  ⨟
data Decl v   = HOLE | DEFN v

The type Tm of terms is unchanged from Appendix A. Likewise, the Contextual
monad and popL, find and inScope operations use the new definition of Context
but are otherwise identical. The freshMeta operation is parameterised over the
kind of the metavariable to create:

freshMeta :: String -> Kind -> Contextual (Name Type)
freshMeta a κ = do  α <- fresh (s2n a)
                    modify (:< E α κ HOLE)
                    return α

The unification algorithm must search the context for metavariable
declarations (perhaps of a particular kind), make some changes and either choose
to restore the existing declaration or replace it with a new one. As before, the
onTop function captures this pattern, and it is used to implement onTop* and
onTopU, which look for a metavariable of the corresponding kind.

data Extension = Restore | Replace Suffix

restore :: Contextual Extension
restore = return Restore

replace :: Suffix -> Contextual Extension
replace = return . Replace

onTop ::  (Name Type -> Kind -> Decl Type -> Contextual Extension) ->
            Contextual ()
onTop f = popL >>= \ e -> case e of
    E α κ d  -> f α κ d >>= \ m -> case m of
                    Replace Ξ  -> modify (<>< Ξ)
                    Restore    -> modify (:< e)
    _        -> onTop f >> modify (:< e)

onTop* ::  (Name Type -> Decl Type -> Contextual Extension) ->
             Contextual ()
onTop* f = onTop $ \ α κ d -> case κ of
    *  -> f α d
    U  -> onTop* f >> restore

onTopU ::  (Name Type -> Decl Type -> Contextual Extension) ->
             Contextual ()
onTopU f = onTop $ \ α κ d -> case κ of
    U  -> f α d
    *  -> onTopU f >> restore



B.3 Unification of unit expressions 

I now implement the abelian group unification algorithm given in Section 3.1 
(page 35). This is based around an algorithm for unifying single expressions with 
the group identity. A pair of expressions can then be unified thus: 

unifyUnit :: Type -> Type -> Contextual ()
unifyUnit d e = unifyId Nothing $ typeToUnit d ⊘ typeToUnit e

To unify a unit expression e with the identity, first check if it is already the
identity (and win) or is another constant (and lose). Otherwise, search the
context for group variables that occur in e. When one is found, either substitute
it into the expression (if it has a definition) or examine the coefficients to
determine how to proceed. If its coefficient n divides all the others, it can be
defined to solve the equation. Otherwise, either reduce the coefficients modulo n
or just collect the variable and move it back in the context.

unifyId :: Maybe (Name Type) -> Unit -> Contextual ()
unifyId Ψ e
  | isIdentity e  = return ()
  | isConstant e  = throwError "Unit mismatch!"
  | otherwise     = onTopU $ \ α d ->
      let n = α `powerIn` e in
      case d of
        _ | n == 0  -> do  unifyId Ψ e
                           restore
        DEFN x      -> do  modify (ins Ψ)
                           let e' = substUnit α (typeToUnit x) e
                           unifyId Nothing e'
                           restore
        HOLE
          | n `dividesPowers` e   -> do  modify (ins Ψ)
                                         let p = pivot α e
                                         replace [(α, U, DEFN (unitToType p))]
          | (α, n) `notMax` e     -> do  modify (ins Ψ)
                                         β <- fresh (s2n "beta")
                                         let p = pivot α e ⊗ metaUnit β
                                         unifyId (Just β) $ substUnit α p e
                                         replace [(α, U, DEFN (unitToType p))]
          | numVariables e > 1    -> do  unifyId (Just α) e
                                         replace []
          | otherwise             -> throwError "No way!"

ins :: Maybe (Name Type) -> Context -> Context
ins Nothing   Θ  = Θ
ins (Just α)  Θ  = Θ :< E α U HOLE






B.4 Unification of types

Here I implement the type unification algorithm given in Section 3.2 (page 39).
The implementation of unify for types with units of measure is very similar to the
version in Section A.2, except that it calls unifyUnit to unify the unit annotations
of Float types, and uses startSolve in place of solve as discussed below.

unify :: Type -> Type -> Contextual ()
unify (τ0 :-> τ1)  (υ0 :-> υ1)  = unify τ0 υ0 >> unify τ1 υ1
unify (Float d)    (Float e)    = unifyUnit d e
unify (M α)        (M β)        = onTop* $ \ γ d -> case
          (γ == α,  γ == β,  d      ) of
          (True,    True,    _      )  -> restore
          (True,    False,   HOLE   )  -> replace [(α, *, DEFN (M β))]
          (False,   True,    HOLE   )  -> replace [(β, *, DEFN (M α))]
          (True,    False,   DEFN τ )  -> unify (M β) τ      >> restore
          (False,   True,    DEFN τ )  -> unify (M α) τ      >> restore
          (False,   False,   _      )  -> unify (M α) (M β)  >> restore
unify (M α)        τ            = startSolve α τ
unify τ            (M α)        = startSolve α τ
unify _            _            = throwError "Rigid-rigid mismatch"

When starting to solve a flex-rigid constraint, one has to be careful not to ac- 
cidentally lose polymorphism, as explained in Subsection 3.2.1 (page 40). The 
syntactic occurs check performed by solve is not quite right, because the richer 
equational theory of abelian groups may exhibit apparent dependency when there 
is in fact none. Thus startSolve replaces units in the rigid type with fresh variables, 
solves the flex-rigid constraint first, then unifies the units. 

startSolve :: Name Type -> Type -> Contextual ()
startSolve α τ = do  (ρ, xs) <- rigidHull τ
                     solve α (constraintsToSuffix xs) ρ
                     solveConstraints xs

The rigidHull operation computes the 'hull' of a type of kind *, replacing unit
subexpressions with fresh variables. Along with the hull, it returns the constraints
between the fresh variables and the units they replaced.

rigidHull :: Type -> Contextual (Type, [(Name Type, Type)])
rigidHull (M α)      = return (M α, [])
rigidHull (V α)      = return (V α, [])
rigidHull (τ :-> υ)  = do  (τ', xs) <- rigidHull τ
                           (υ', ys) <- rigidHull υ
                           return (τ' :-> υ', xs ⊕ ys)
rigidHull (Float d)  = do  β <- fresh (s2n "beta")
                           return (Float (M β), [(β, d)])
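As a small illustration of what the hull looks like, the sketch below computes it for a cut-down type language. It assumes String names and threads an integer counter by hand to generate fresh names, where the listing uses the Fresh monad; the constructors kept and the naming scheme "beta0", "beta1", ... are choices made for this example only.

```haskell
data Type = M String | Type :-> Type | Float Type | One | Base String
  deriving (Eq, Show)
infixr 5 :->

-- Given the next unused counter value, return the new counter, the hull,
-- and the constraints between fresh variables and the units they replace.
rigidHull :: Int -> Type -> (Int, Type, [(String, Type)])
rigidHull n (s :-> t) =
  let (n1, s', xs) = rigidHull n s
      (n2, t', ys) = rigidHull n1 t
  in  (n2, s' :-> t', xs ++ ys)
rigidHull n (Float d) =
  let b = "beta" ++ show n
  in  (n + 1, Float (M b), [(b, d)])
rigidHull n t = (n, t, [])
```

Applied to Float (Base "m") :-> Float One with counter 0, this returns the hull Float (M "beta0") :-> Float (M "beta1") together with the constraints beta0 = Base "m" and beta1 = One, which are the equations that remain to be solved by unit unification.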

A list of constraints can be turned into the appropriate context suffix by discard- 
ing the types and adding unit declarations for the metavariables: 

constraintsToSuffix :: [(Name Type, Type)] -> Suffix
constraintsToSuffix = map (\ (α, _) -> (α, U, HOLE))

Or they can be solved by repeatedly invoking unifyUnit:

solveConstraints :: [(Name Type, Type)] -> Contextual ()
solveConstraints = mapM_ (uncurry $ unifyUnit . M)

The implementation of solve is almost identical to the version in Appendix A. 

solve :: Name Type -> Suffix -> Type -> Contextual ()
solve α Ξ τ = onTop* $
    \ γ d -> case
      (γ == α,  γ ∈ fmv τ,  d      ) of
      (_,       _,          DEFN υ )  ->  modify (<>< Ξ)
                                      >>  unify (subst γ υ (M α)) (subst γ υ τ)
                                      >>  restore
      (True,    True,       HOLE   )  ->  throwError "Occurrence detected!"
      (True,    False,      HOLE   )  ->  replace $ Ξ ⊕ [(α, *, DEFN τ)]
      (False,   True,       HOLE   )  ->  solve α ((γ, *, HOLE) : Ξ) τ
                                      >>  replace []
      (False,   False,      HOLE   )  ->  solve α Ξ τ
                                      >>  restore

Appendix C



Reference implementation of 
Miller pattern unification 

Having specified the pattern unification algorithm in Chapter 4, I now implement
it in Haskell. The code is organised along similar lines to the previous two
appendices, although the details differ substantially. First I describe the
representation of object language terms (Section C.1) and the domain-specific
language in which I will implement the algorithm (Section C.2). I then give
implementations of type and equality checking (Section C.3), and unification
(Section C.4).

C.1 Representation of terms

First I define terms and machinery for working with them (including evaluation 
and occurrence checking), based on the description in Subsection 4.1.1 (page 53). 

Object language terms are represented using the data type Tm. The Binders
Unbound library of Weirich et al. (2011b) defines the Bind type constructor
and gives a cheap locally nameless representation with operations including
α-equivalence and substitution for first-order datatypes containing terms. I use
a single constructor for all the canonical forms (that do not involve binding) so
as to factor out common patterns in the typechecker.

data Tm where
    λ     :: Bind Nom Tm -> Tm
    (·)   :: Head -> Bwd Elim -> Tm
    C     :: Can Tm -> Tm
    Π, Σ  :: Type -> Bind Nom Type -> Tm

type Nom = Name Tm

data Can t  = Set | Type | Pair t t | Bool | Tt | Ff | N | Ze | Su t
data Head   = V Nom Twin | M Nom
data Twin   = Only | TwinL | TwinR
data Elim   = A Tm | Hd | Tl | If (Bind Nom Type) Tm Tm
type Type   = Tm

The non-binding canonical forms Can induce a Foldable functor (which can be
derived automatically by GHC). Annoyingly, Elim cannot be made a functor in
the same way, because Bind Nom is not a functor on * but only on the subcategory
induced by Alpha. However, the action on morphisms can be defined thus:

mapElim :: (Tm -> Tm) -> Elim -> Elim
mapElim f  (A a)       = A (f a)
mapElim _  Hd          = Hd
mapElim _  Tl          = Tl
mapElim f  (If T s t)  = If (bind x (f T')) (f s) (f t)
  where (x, T') = unsafeUnbind T

foldMapElim :: Monoid m => (Tm -> m) -> Elim -> m
foldMapElim f  (A a)       = f a
foldMapElim _  Hd          = mempty
foldMapElim _  Tl          = mempty
foldMapElim f  (If T s t)  = f T' ⊕ f s ⊕ f t
  where (_, T') = unsafeUnbind T

Despite the single-constructor representation of canonical forms, it is often 
neater to write code as if Tm had a data constructor for each canonical constructor 
of the object language. This is possible thanks to pattern synonyms (Aitken and 
Reppy, 1992) as implemented by the Strathclyde Haskell Enhancement (McBride, 
2010b). Pattern synonyms are abbreviations that can be used 'on the left' (in 
patterns) as well as 'on the right' (in expressions). 



pattern Type      = C Type
pattern Set       = C Set
pattern pair s t  = C (Pair s t)
pattern B         = C Bool
pattern tt        = C Tt
pattern ff        = C Ff
pattern N         = C N
pattern ze        = C Ze
pattern su n      = C (Su n)
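The declarations above rely on the Strathclyde Haskell Enhancement preprocessor. A similar effect is available in GHC itself through the PatternSynonyms extension, which is what the following sketch assumes; it is not the mechanism used in the thesis. GHC requires pattern synonym names to start with an upper-case letter, so the names SET, PAIR, ZE and SU are hypothetical, and Tm is cut down to canonical forms plus a placeholder Var constructor.

```haskell
{-# LANGUAGE PatternSynonyms #-}

-- Canonical forms share one constructor C, as in the text.
data Can t = Set | Type | Pair t t | Bool | Tt | Ff | Nat | Ze | Su t
  deriving (Eq, Show)

data Tm = C (Can Tm) | Var String
  deriving (Eq, Show)

-- Bidirectional synonyms: usable in patterns and in expressions.
pattern SET :: Tm
pattern SET = C Set

pattern PAIR :: Tm -> Tm -> Tm
pattern PAIR s t = C (Pair s t)

pattern ZE :: Tm
pattern ZE = C Ze

pattern SU :: Tm -> Tm
pattern SU n = C (Su n)

-- Used 'on the left': matching on numerals as if Tm had ZE and SU constructors.
toInt :: Tm -> Maybe Int
toInt ZE     = Just 0
toInt (SU n) = (+ 1) <$> toInt n
toInt _      = Nothing

-- Used on both sides: swap the components of a pair.
swapPair :: Tm -> Tm
swapPair (PAIR s t) = PAIR t s
swapPair t          = t
```

Code written against the synonyms does not need to change if the underlying single-constructor representation is later rearranged.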
Free variables

Rather than defining functions to determine the free metavariables and variables
of terms directly, I use a typeclass to make them available on the whole syntax.

data Flavour = Vars | RigVars | Metas

class Occurs t where
    free :: Flavour -> t -> Set Nom

fv, fvrig, fmv :: Occurs t => t -> Set Nom
fv     = free Vars
fvrig  = free RigVars
fmv    = free Metas

instance Occurs Tm where
    free l        (λ b)        = free l b
    free l        (C c)        = free l c
    free l        (Π S T)      = free l S ∪ free l T
    free l        (Σ S T)      = free l S ∪ free l T
    free RigVars  (V x _ · e)  = {x} ∪ free RigVars e
    free RigVars  (M _ · _)    = ∅
    free l        (h · e)      = free l h ∪ free l e

instance Occurs t => Occurs (Can t) where
    free l (Pair s t)  = free l s ∪ free l t
    free l (Su n)      = free l n
    free l _           = ∅

instance Occurs Head where
    free Vars     (M _)    = ∅
    free RigVars  (M _)    = ∅
    free Metas    (M α)    = {α}
    free Vars     (V x _)  = {x}
    free RigVars  (V x _)  = {x}
    free Metas    (V _ _)  = ∅

instance Occurs Elim where
    free l (A a)       = free l a
    free l Hd          = ∅
    free l Tl          = ∅
    free l (If T s t)  = free l T ∪ free l s ∪ free l t

Evaluation by hereditary substitution

Substitutions are implemented as finite maps from names to terms; as a technical
convenience there is no distinction between substitution and metasubstitution.

type Subs = [(Nom, Tm)]

(∘) :: Subs -> Subs -> Subs
δ' ∘ δ = unionBy ((==) `on` fst) δ' (substs δ' δ)

The evaluator is an implementation of hereditary substitution defined in
Figure 4.2 (page 54): it proceeds structurally through terms, replacing variables
with their values and eliminating redexes using the (%%) operator defined below.

eval :: Subs -> Tm -> Tm
eval g (λ b)    = λ (evalUnder g b)
eval g (h · e)  = foldl (%%) (evalHead g h) (fmap (mapElim (eval g)) e)
eval g (C c)    = C (fmap (eval g) c)
eval g (Π S T)  = Π (eval g S) (evalUnder g T)
eval g (Σ S T)  = Σ (eval g S) (evalUnder g T)

evalHead :: Subs -> Head -> Tm
evalHead g (V x _)  | Just t <- lookup x g  = t
evalHead g (M α)    | Just t <- lookup α g  = t
evalHead g h                                = h · •

evalUnder :: Subs -> Bind Nom Tm -> Bind Nom Tm
evalUnder g b = bind x (eval g t)
  where (x, t) = unsafeUnbind b

The (%%) operator reduces a redex (a term with an eliminator) to normal form:
this re-invokes hereditary substitution when a λ-abstraction meets an application.

(%%) :: Tm -> Elim -> Tm
λ b       %% (A a)     = eval [(x, a)] t  where (x, t) = unsafeUnbind b
pair x _  %% Hd        = x
pair _ y  %% Tl        = y
tt        %% If _ t _  = t
ff        %% If _ _ f  = f
h · e     %% z         = h · (e :< z)
t         %% a         = error "bad elimination"

I define some convenient abbreviations: ($$) for applying a function to an
argument, ($*$) for applying a function to a telescope of arguments, ·{·} for
substituting out a single binding and hd and tl for the projections from Σ-types.

($$) :: Tm -> Tm -> Tm
f $$ a = f %% A a

($*$) :: Tm -> Bwd (Nom, Type) -> Tm
f $*$ Γ = foldl ($$) f (fmap (var . fst) Γ)

·{·} :: Bind Nom Tm -> Tm -> Tm
f{s} = λ f $$ s

hd, tl :: Tm -> Tm
hd = (%% Hd)
tl = (%% Tl)



C.2 Problems and contexts 

I will now define unification problems, metacontexts and operations for working
on them in the Contextual monad. The notions of metacontext and context in use
were given in Subsection 4.1.2 (page 55), and the monadic approach develops that
of the previous appendices. Metacontext entries now consist of metavariables, as
before, or problems, which carry a status bit used to record whether they have
been solved as far as possible given their current type (see Subsection C.4.6).
Problems are equations under universally quantified parameters, and parameters
may include twins.

data Decl v    = HOLE | DEFN v
data Entry     = E (Name Tm) (Type, Decl Tm) | Q Status Problem
data Status    = Blocked | Active

data Param     = P Type | Type ‡ Type
type Params    = Bwd (Nom, Param)

data Equation  = (Tm : Type) ≈ (Tm : Type)
data Problem   = Unify Equation | All Param (Bind Nom Problem)

The sym function swaps the two sides of an equation:

sym :: Equation -> Equation
sym ((s : S) ≈ (t : T)) = (t : T) ≈ (s : S)

The metacontext is represented as a list zipper: a pair of lists representing
the entries before and after the cursor. Entries after the cursor may include
substitutions, being propagated lazily.

type ContextL = Bwd Entry 

type ContextR = [Either Subs Entry] 

type Context = (ContextL, ContextR) 

The Contextual monad stores the current context and parameters, generates 
fresh names when required for going under binders, and handles exceptions. 

newtype Contextual a = Contextual 

(ReaderT Params (StateT Context (FreshMT (ErrorT String Identity))) a) 

Reading and modifying state 

I define versions of the usual state-manipulating get, modify and put operations 
that act on the left or right part of the context (before or after the cursor). 

getL :: Contextual ContextL
getL = gets fst

getR :: Contextual ContextR
getR = gets snd

modifyL :: (ContextL -> ContextL) -> Contextual ()
modifyL = modify . first

modifyR :: (ContextR -> ContextR) -> Contextual ()
modifyR = modify . second

putL :: ContextL -> Contextual ()
putL = modifyL . const

putR :: ContextR -> Contextual ()
putR = modifyR . const

Here are operations to push to, or pop from, either side of the cursor, or move 
the cursor one entry to the left: 

pushL :: Entry — > Contextual () 
pushL e = modifyL (:<e) 

pushR :: Either Subs Entry — > Contextual () 
pushR e = modifyR (e:) 
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pushLs :: Traversable / =>• / Entry — > Contextual () 
pushLs es = traverse pushL es » return () 

popL :: Contextual Entry 
popL = do 0 getL 

case 0 of (0' :< e) ->■ putL <9' > return e 

• — > throwError "popL : out of context" 

popR :: Contextual (Either Subs Entry) 
popR = do 0 <- getR 

case 0 of (x : <9') ->■ putR <9' > return x 

i — >■ throwError "popR: out of context" 

goLeft :: Contextual () 

goLeft = popL >■= pushR o Right 
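
The zipper operations above can be sketched in isolation. The following is a simplified pure model, not the thesis code: entries are arbitrary values, the right-hand side carries no substitutions, and failure is reported with Maybe rather than throwError. The name Zipper is an assumption made for this illustration.

```haskell
-- Backwards ("snoc") lists, as used for the left part of the context.
data Bwd a = B0 | Bwd a :< a
  deriving (Eq, Show)

infixl 5 :<

-- The cursor sits between the two components (hypothetical type name).
type Zipper a = (Bwd a, [a])

pushL :: a -> Zipper a -> Zipper a
pushL e (l, r) = (l :< e, r)

pushR :: a -> Zipper a -> Zipper a
pushR e (l, r) = (l, e : r)

-- Pop the entry immediately left of the cursor, if there is one.
popL :: Zipper a -> Maybe (a, Zipper a)
popL (l :< e, r) = Just (e, (l, r))
popL (B0, _)     = Nothing

-- Pop the entry immediately right of the cursor, if there is one.
popR :: Zipper a -> Maybe (a, Zipper a)
popR (l, e : r) = Just (e, (l, r))
popR (_, [])    = Nothing

-- Move the cursor one entry to the left: pop on the left, push on the right.
goLeft :: Zipper a -> Maybe (Zipper a)
goLeft z = do
  (e, z') <- popL z
  return (pushR e z')

example :: Maybe (Zipper Char)
example = goLeft (B0 :< 'a' :< 'b', "c")
```

In this model, moving left from a cursor placed after 'b' puts 'b' at the front of the right-hand list, mirroring goLeft = popL >>= pushR ∘ Right.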

Variable and metavariable lookup 

The context of local parameters is tracked using the ReaderT monad transformer, 
so the local operation can be used to bring a parameter into scope, and the ask 
operation can be used to look up a variable. 

inScope :: Nom -> Param -> Contextual a -> Contextual a
inScope x p = local (:< (x, p))

lookupVar :: Nom -> Twin -> Contextual Type
lookupVar x w = help w =<< ask
  where
    help Only   (Γ :< (y, P T))    | x == y  = return T
    help TwinL  (Γ :< (y, S ‡ T))  | x == y  = return S
    help TwinR  (Γ :< (y, S ‡ T))  | x == y  = return T
    help w      (Γ :< _)                     = help w Γ
    help _      •                            = throwError $ "lookupVar: missing " ++ show x

The type of a metavariable can be determined from its name by searching the 
metacontext. Only metavariables left of the cursor are in scope. 

lookupMeta :: Nom -> Contextual Type
lookupMeta x = look =<< getL
  where
    look (Θ :< E y (T, _))  | x == y  = return T
    look (Θ :< _)                     = look Θ
    look •                            = error $ "lookupMeta: missing " ++ show x

C.3 Type and equality checking 



Here I give a typechecker and definitional equality test for the type theory defined 
in Subsection 4.1.3 (page 56). With the Contextual monad operations, I define 
a bidirectional typechecker, based on a typed definitional equality test between 
β-normal forms that produces an η-long standard form. The equalise T s t 
function implements the judgment Θ | Γ ⊢ T ∋ s ↘ u ↙ t, defined in Figure 4.4 
(page 59), where u is the result. 

equalise :: Type -> Tm -> Tm -> Contextual Tm
equalise Type  Set  Set  = return Set
equalise Type  S    T    = equalise Set S T
equalise Set   𝔹    𝔹    = return 𝔹
equalise 𝔹     tt   tt   = return tt
equalise 𝔹     ff   ff   = return ff
equalise Set (Π A B) (Π S T) = do
    U <- equalise Set A S
    Π U <$> bindsInScope U B T
              (\x B' T' -> equalise Set B' T')
equalise (Π U V) f g =
    λ <$> bindInScope U V
            (\x V' -> equalise V' (f $$ var x) (g $$ var x))
equalise Set (Σ A B) (Σ S T) = do
    U <- equalise Set A S
    Σ U <$> bindsInScope U B T
              (\x B' T' -> equalise Set B' T')
equalise (Σ U V) s t = do
    u0 <- equalise U (hd s) (hd t)
    u1 <- equalise (V{u0}) (tl s) (tl t)
    return (pair u0 u1)
equalise U (h · e) (h' · e') = do
    (h'', e'', V) <- equaliseN h e h' e'
    equalise Type U V
    return (h'' · e'')

Similarly, the equaliseN h e h' e' function implements the equality judgment 
Θ | Γ ⊢ h · e ↘ h'' · e'' ↙ h' · e' ∈ T, defined in Figure 4.5 (page 60), where h'', e'' 
and T are the results. 

equaliseN :: Head -> Bwd Elim -> Head -> Bwd Elim ->
               Contextual (Head, Bwd Elim, Type)
equaliseN h • h' •  | h == h'  = (h, •, ) <$> infer h
equaliseN h (e :< A s) h' (e' :< A t) = do
    (h'', e'', Π U V) <- equaliseN h e h' e'
    u <- equalise U s t
    return (h'', e'' :< A u, V{u})
equaliseN h (e :< Hd) h' (e' :< Hd) = do
    (h'', e'', Σ U V) <- equaliseN h e h' e'
    return (h'', e'' :< Hd, U)
equaliseN h (e :< Tl) h' (e' :< Tl) = do
    (h'', e'', Σ U V) <- equaliseN h e h' e'
    return (h'', e'' :< Tl, V{h'' · (e'' :< Hd)})
equaliseN h (e :< If T u v) h' (e' :< If T' u' v') = do
    (h'', e'', 𝔹) <- equaliseN h e h' e'
    U'' <- bindsInScope 𝔹 T T' (\x U U' -> equalise Type U U')
    u'' <- equalise (U''{tt}) u u'
    v'' <- equalise (U''{ff}) v v'
    return (h'', e'' :< If U'' u'' v'', U''{h'' · e''})

The infer function looks up the type of a head, using lookupVar or lookupMeta 
from the previous section as appropriate. 

infer :: Head -> Contextual Type
infer (V x w)  = lookupVar x w
infer (M x)    = lookupMeta x

The bindInScope and bindsInScope helper operations introduce a binding or two 
and call the continuation with a variable of the given type in scope. 

bindInScope :: Type -> Bind Nom Tm ->
                 (Nom -> Tm -> Contextual Tm) ->
                 Contextual (Bind Nom Tm)
bindInScope T b f = do  (x, b') <- unbind b
                        bind x <$> inScope x (P T) (f x b')

bindsInScope :: Type -> Bind Nom Tm -> Bind Nom Tm ->
                  (Nom -> Tm -> Tm -> Contextual Tm) ->
                  Contextual (Bind Nom Tm)
bindsInScope T a b f = do  Just (x, a', _, b') <- unbind2 a b
                           bind x <$> inScope x (P T) (f x a' b')

Equality checking can return a Boolean instead of throwing an error when the 
terms are not equal. Since typing is the diagonal of equality, it is easy to define 
a typechecking function as well. 

equal :: Type -> Tm -> Tm -> Contextual Bool
equal T s t = (equalise T s t >> return True) <|> (return False)

typecheck :: Type -> Tm -> Contextual Bool
typecheck T t = equal T t t

Finally, a convenience function that tests if a heterogeneous equation is 
reflexive, by checking that the types are equal and the terms are equal. 

isReflexive :: Equation -> Contextual Bool
isReflexive ((s : S) ≈ (t : T)) = optional (equalise Type S T) >>=
                                    maybe (return False) (\U -> equal U s t)

C.4 Unification 

With the preliminaries out of the way, I can now present the pattern unification 
algorithm as specified in Section 4.2 (page 67). I begin with utilities for working 
with metavariables and problems, then give the implementations of inversion, 
intersection, pruning, metavariable simplification and problem simplification. Fi- 
nally, I show how the order of constraint solving is managed. 

Making and filling holes 

A telescope is a list of binding names and their types. Any type can be viewed 
as consisting of a Π-bound telescope followed by a non-Π-type. 

type Telescope = Bwd (Nom, Type)

telescope :: Type -> Contextual (Telescope, Type)
telescope (Π S T)  = do  (x, T') <- unbind T
                         (Δ, U) <- telescope T'
                         return ((• :< (x, S)) <> Δ, U)
telescope T        = return (•, T)
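
To make the telescope view concrete, here is a minimal sketch over a toy type syntax. It is not the thesis representation: binders carry explicit names instead of being opened with unbind, ordinary lists replace Bwd, and the names Ty, Base and pis are assumptions made for this illustration.

```haskell
-- A toy type syntax with named binders (an assumption; the thesis uses
-- locally nameless binders opened with 'unbind').
data Ty = Base String | Pi String Ty Ty
  deriving (Eq, Show)

type Telescope = [(String, Ty)]

-- Split a type into its leading Pi-bound telescope and a non-Pi result.
telescope :: Ty -> (Telescope, Ty)
telescope (Pi x s t) = ((x, s) : tel, u)
  where (tel, u) = telescope t
telescope t = ([], t)

-- Rebuild a type from a telescope and a result type (hypothetical helper).
pis :: Telescope -> Ty -> Ty
pis tel t = foldr (\(x, s) u -> Pi x s u) t tel

exampleTy :: Ty
exampleTy = Pi "x" (Base "A") (Pi "y" (Base "B") (Base "C"))
```

Only the spine of Π-binders is unfolded; a Π-type in a domain position stays inside the telescope entry, and pis reassembles the original type.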

The hole control operator creates a metavariable of the given type (under a tele- 
scope of parameters), and calls the continuation with the metavariable in scope. 
Finally, it moves the cursor back to the left of the metavariable, so it will be 
examined again in case further progress can be made on it. The continuation 
must not move the cursor. 

hole :: Telescope -> Type -> (Tm -> Contextual a) -> Contextual a
hole Γ T f = do  α <- fresh (s2n "alpha")
                 pushL $ E α (ΠΓ. T, HOLE)
                 r <- f (meta α $*$ Γ)
                 goLeft
                 return r

Once a solution for a metavariable is found, the define function adds a defi- 
nition to the context. (The declaration of the metavariable should already have 
been removed.) This also propagates a substitution that replaces the metavari- 
able with its value. 

define :: Telescope -> Nom -> Type -> Tm -> Contextual ()
define Γ α S v = do  pushR $ Left [(α, t)]
                     pushR $ Right $ E α (T, DEFN t)
  where  T = ΠΓ. S
         t = λΓ. v

Postponing problems 

When a problem cannot be solved immediately, it can be postponed by adding 
it to the metacontext. The postpone function wraps a problem in the current 
context (as returned by ask) and stores it in the metacontext with the given 
status. The active function postpones a problem on which progress can be made, 
while the block function postpones a problem that cannot make progress until its 
type becomes more informative, as discussed in Subsection C.4.6. 

postpone :: Status -> Problem -> Contextual ()
postpone s p = pushR ∘ Right ∘ Q s ∘ wrapProb p =<< ask
  where
    wrapProb :: Problem -> Params -> Problem
    wrapProb = foldr (\(x, e) p -> All e (bind x p))

active, block :: Problem -> Contextual ()
active  = postpone Active
block   = postpone Blocked
A useful combinator 

The following combinator executes its first argument, and if this returns False 
then it also executes its second argument. 

(<||) :: Monad m => m Bool -> m () -> m ()
a <|| b = do  x <- a
              unless x b
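
The behaviour of this combinator when chained can be seen in a small standalone sketch. The operator is written here as <|| with right-associative fixity, both of which are assumptions about the original source. The pair monad ([String], a) from base stands in for Contextual as a simple logging monad, and step and final are hypothetical helpers.

```haskell
import Control.Monad (unless)

infixr 5 <||

-- Run the first action; run the second only if the first returned False.
(<||) :: Monad m => m Bool -> m () -> m ()
a <|| b = do
  x <- a
  unless x b

-- A step that records its name and reports whether it made progress.
step :: String -> Bool -> ([String], Bool)
step name ok = ([name], ok)

-- A last-resort action that only records its name.
final :: String -> ([String], ())
final name = ([name], ())

succeedsEarly, fallsThrough, stopsMidway :: [String]
succeedsEarly = fst (step "tryPrune q" True <|| final "flexTerm")
fallsThrough  = fst (step "tryPrune q" False <|| step "tryPrune (sym q)" False <|| final "flexFlex")
stopsMidway   = fst (step "tryPrune q" False <|| step "tryPrune (sym q)" True <|| final "flexFlex")
```

Each strategy in a chain runs only if every strategy before it reported no progress, which is the pattern the unification steps in Subsection C.4.5 rely on.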

C.4.1 Inversion 

A flexible unification problem is one where one side is an applied metavariable 
and the other is an arbitrary term. The algorithm moves left in the context, 
accumulating a list of metavariables Ξ that the term depends on, to construct 
the necessary dependency-respecting permutation. Once the target metavariable 
is reached, it can attempt to find a solution by inversion. This implements step 
(4.16) in Figure 4.15 (page 80), as described in Subsection 4.2.1 (page 67). 

flexTerm :: [Entry] -> Equation -> Contextual ()
flexTerm Ξ q@(M α · _ ≈ _) = do
    Γ <- fmap snd <$> ask
    popL >>= \e -> case e of
        E β (T, HOLE)
            | α == β ∧ α ∈ fmv Ξ  -> do  pushLs (e : Ξ)
                                         block (Unify q)
            | α == β              -> do  pushLs Ξ
                                         tryInvert q T
                                             <|| (block (Unify q) >> pushL e)
            | β ∈ fmv (Γ, Ξ, q)   -> flexTerm (e : Ξ) q
        _                         -> do  pushR (Right e)
                                         flexTerm Ξ q

A flex-flex unification problem is one where both sides are applied metavari- 
ables. As in the general case above, the algorithm proceeds leftwards through the 
context, looking for one of the metavariables so it can try to solve one with the 
other. If it reaches one of the metavariables and cannot solve for the metavariable 
by inversion, it continues (using flexTerm), which ensures it will terminate after 
trying to solve for both. For example, consider the case α t̄ᵢ ≈ β x̄ⱼ where only 
x̄ⱼ is a list of variables. If it reaches α first then it might get stuck even if it 
could potentially solve for β. This would be correct if order were important in 
the metacontext, for example when implementing let-generalisation as discussed 
in Chapter 2. Here it is not, so the algorithm can simply pick up α and carry on. 

flexFlex :: [Entry] -> Equation -> Contextual ()
flexFlex Ξ q@(M α · ds ≈ M β · es) = do
    Γ <- fmap snd <$> ask
    popL >>= \e -> case e of
        E γ (T, HOLE)
            | γ ∈ [α, β] ∧ γ ∈ fmv (Ξ)  -> do  pushLs (e : Ξ)
                                               block (Unify q)
            | γ == α                    -> do  pushLs Ξ
                                               tryInvert q T <|| flexTerm [e] (sym q)
            | γ == β                    -> do  pushLs Ξ
                                               tryInvert (sym q) T <|| flexTerm [e] q
            | γ ∈ fmv (Γ, Ξ, q)         -> flexFlex (e : Ξ) q
        _                               -> do  pushR (Right e)
                                               flexFlex Ξ q

Given a flexible equation whose head metavariable has just been found in 
the context, the tryInvert control operator calls invert to seek a solution to the 
equation. If it finds one, it defines the metavariable. 

tryInvert :: Equation -> Type -> Contextual Bool
tryInvert q@(M α · e ≈ s) T = invert α T e s >>= \m -> case m of
    Nothing  -> return False
    Just v   -> do  active (Unify q)
                    define • α T v
                    return True

Given a metavariable α of type T, spine e and term t, invert attempts to find 
a value for α that solves the equation α · e ≈ t. It will throw an error if the 
problem is unsolvable due to an impossible occurrence. 

invert :: Nom -> Type -> Bwd Elim -> Tm -> Contextual (Maybe Tm)
invert α T e t  | occurCheck True α t = throwError "occur check"
                | α ∉ fmv t, Just xs <- toVars e, linearOn t xs = do
                    b <- local (const •) (typecheck T (λxs. t))
                    return $ if b then Just (λxs. t) else Nothing
                | otherwise = return Nothing

Note that the solution λxs. t is typechecked under no parameters, so typechecking 
will fail if an out-of-scope variable is used. 
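
As a small worked illustration, consider three equations in which x and y are assumed to be distinct parameter variables. These are standard pattern-unification examples chosen for this note, not cases taken from the text.

```latex
\begin{align*}
\alpha\,x\,y \approx \mathsf{pair}\,y\,x
  &\;\leadsto\; \alpha := \lambda x.\,\lambda y.\,\mathsf{pair}\,y\,x
  && \text{(the spine is a linear list of variables)} \\
\alpha\,x \approx y
  &\;\leadsto\; \text{candidate } \lambda x.\,y \text{ is rejected}
  && \text{($y$ is not in scope under no parameters)} \\
\alpha\,x\,x \approx x
  &\;\leadsto\; \text{no inversion is attempted}
  && \text{($x$ is repeated in the spine and occurs in the term)}
\end{align*}
```

In the second and third cases invert returns Nothing rather than throwing an error, so the equation is left in the context for later.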

The occur check, used to tell if an equation is definitely unsolvable, looks for 
occurrences of a metavariable inside a term. In a strong rigid context (where the 
first argument is True), any occurrence is fatal. In a weak rigid context (where it 
is False), the evaluation context of the metavariable must be a list of variables. 

occurCheck :: Bool -> Nom -> Tm -> Bool
occurCheck w α (λ b)        = occurCheck w α t
  where (_, t) = unsafeUnbind b
occurCheck w α (V _ _ · e)  = getAny $ foldMap
                                  (foldMapElim (Any ∘ occurCheck False α)) e
occurCheck w α (M β · e)    = α == β ∧ (w ∨ isJust (toVars e))
occurCheck w α (C c)        = getAny $ foldMap (Any ∘ occurCheck w α) c
occurCheck w α (Π S T)      = occurCheck w α S ∨ occurCheck w α T'
  where (_, T') = unsafeUnbind T
occurCheck w α (Σ S T)      = occurCheck w α S ∨ occurCheck w α T'
  where (_, T') = unsafeUnbind T
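
The distinction between strong and weak rigid occurrences can be sketched on a toy term syntax. This is an illustrative simplification, not the thesis data type: heads carry plain argument lists, all canonical forms are folded into a single Con constructor, and isVar does no η-contraction.

```haskell
-- Toy terms (an assumption for this sketch): variables and metavariables
-- applied to arguments, lambdas, and constructor forms such as pairs.
data Tm
  = Var String [Tm]
  | Meta String [Tm]
  | Lam String Tm
  | Con String [Tm]

-- occurCheck strong a t: does metavariable a occur in t in a way that
-- makes an equation a ... = t definitely unsolvable?
occurCheck :: Bool -> String -> Tm -> Bool
occurCheck w a (Lam _ t)   = occurCheck w a t
-- Arguments of a rigid variable are in a weak rigid position.
occurCheck _ a (Var _ es)  = any (occurCheck False a) es
-- A strong occurrence is always fatal; a weak one only if the
-- metavariable is applied to a list of variables.
occurCheck w a (Meta b es) = a == b && (w || all isVar es)
occurCheck w a (Con _ ts)  = any (occurCheck w a) ts

isVar :: Tm -> Bool
isVar (Var _ []) = True
isVar _          = False
```

An occurrence directly under constructors is fatal, whereas an occurrence under a variable head is fatal only when the metavariable's own arguments are all variables.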

Here toVars tries to convert a spine to a list of variables, and linearOn determines 
if a list of variables is linear on the free variables of a term. Since it is enough 
for a term in a spine to be η-convertible to a variable, the etaContract function 
implements η-contraction for terms. 

linearOn :: Tm -> Bwd Nom -> Bool
linearOn _ •          = True
linearOn t (as :< a)  = ¬ (a ∈ fv t ∧ a ∈ as) ∧ linearOn t as

etaContract :: Tm -> Tm
etaContract (λ b) = case etaContract t of
    x · (e :< A (V y' _ · •))  | y == y', ¬ (y ∈ fv e)  -> x · e
    t'                                                  -> λy. t'
  where (y, t) = unsafeUnbind b
etaContract (x · as) = x · (fmap (mapElim etaContract) as)
etaContract (pair s t) = case (etaContract s, etaContract t) of
    (x · (as :< Hd), y · (bs :< Tl))  | x == y, as == bs  -> x · as
    (s', t')                                              -> pair s' t'
etaContract (C c) = C (fmap etaContract c)

toVar :: Tm -> Maybe Nom
toVar v = case etaContract v of  V x _ · •  -> Just x
                                 _          -> Nothing

toVars :: Traversable f => f Elim -> Maybe (f Nom)
toVars = traverse (unA >=> toVar)
  where  unA (A t)  = Just t
         unA _      = Nothing
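
The linearity condition and the conversion of a spine to variables can be tried out on a toy term language. This sketch makes several simplifying assumptions: variables are strings, ordinary lists replace Bwd, there is no η-contraction, and the Tm type is hypothetical.

```haskell
import Data.List (nub)

-- A toy lambda-term syntax (an assumption for this sketch).
data Tm = Var String | App Tm Tm | Lam String Tm
  deriving (Eq, Show)

-- Free variables of a term.
fv :: Tm -> [String]
fv (Var x)   = [x]
fv (App f a) = nub (fv f ++ fv a)
fv (Lam x b) = filter (/= x) (fv b)

-- linearOn t xs: no variable that occurs free in t is repeated in xs.
-- Repeated variables that do not occur in t are harmless.
linearOn :: Tm -> [String] -> Bool
linearOn _ []       = True
linearOn t (a : as) = not (a `elem` fv t && a `elem` as) && linearOn t as

-- toVars succeeds only if every argument is literally a variable.
toVars :: [Tm] -> Maybe [String]
toVars = traverse toVar
  where
    toVar (Var x) = Just x
    toVar _       = Nothing
```

For instance, linearOn (Var "y") ["x", "x"] holds because the repeated x does not occur in the term, so either copy could be chosen without affecting the solution.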

C.4.2 Intersection 

When a flex-flex equation has the same metavariable on both sides, i.e. it has 
the form α x̄ᵢ ≈ α ȳᵢ where x̄ᵢ and ȳᵢ are both lists of variables, the equation 
can be solved by restricting α to the arguments on which x̄ᵢ and ȳᵢ agree (i.e. 
creating a new metavariable β and using it to solve α). This implements step 
(4.18) in Figure 4.15 (page 80), as described in Subsection 4.2.2 (page 70). 

The flexFlexSame function extracts the type of α as a telescope and calls 
intersect to generate a restricted telescope. If this succeeds, it calls instantiate to 
create a new metavariable and solve the old one. Otherwise, it leaves the equation 
in the context. Twin annotations can be ignored here because any twins will 
have definitionally equal types anyway. 

flexFlexSame :: Equation -> Contextual ()
flexFlexSame q@(M α · e ≈ M α · e') = do
    (Δ, T) <- telescope =<< lookupMeta α
    case intersect Δ e e' of
        Just Δ'  | fv T ⊆ vars Δ'  -> instantiate (α, ΠΔ'. T, \β -> λΔ. β $*$ Δ')
        _                          -> block (Unify q)

Given a telescope and the two evaluation contexts, intersect checks the evaluation 
contexts are lists of variables and produces the telescope on which they agree. 

intersect :: Telescope -> Bwd Elim -> Bwd Elim -> Maybe Telescope
intersect • • • = return •
intersect (Δ :< (z, S)) (e :< A s) (e' :< A t) = do
    Δ' <- intersect Δ e e'
    x <- toVar s
    y <- toVar t
    if x == y then return (Δ' :< (z, S)) else return Δ'
intersect _ _ _ = Nothing
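
A stripped-down version of intersect shows the idea. The simplifications are assumptions made for this sketch: both spines are already lists of variable names, ordinary lists replace Bwd, and types are opaque strings.

```haskell
-- Keep the telescope entries at positions where the two argument lists
-- agree; fail if the lengths do not match.
intersect :: [(String, String)] -> [String] -> [String] -> Maybe [(String, String)]
intersect [] [] [] = Just []
intersect (p : tel) (x : xs) (y : ys) = do
  tel' <- intersect tel xs ys
  return (if x == y then p : tel' else tel')
intersect _ _ _ = Nothing

example :: Maybe [(String, String)]
example = intersect [("a", "A"), ("b", "B"), ("c", "C")] ["x", "y", "z"] ["x", "w", "z"]
```

Here the second argument differs between the two sides, so the restricted telescope keeps only a and c. The full algorithm additionally checks that the result type mentions only the surviving entries before instantiating.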

C.4.3 Pruning 

Given a flex-rigid or flex-flex equation, it might be possible to make some progress 
by pruning the metavariables contained within it, as described in Subsection 4.2.3 
(page 71). The tryPrune function calls pruneTm: if it learns anything from prun- 
ing, it leaves the current problem active and instantiates the pruned metavariable. 

tryPrune :: Equation -> Contextual Bool
tryPrune q@(M α · e ≈ t) = pruneTm (fv e) t >>= \u -> case u of
    d : _  -> active (Unify q) >> instantiate d >> return True
    []     -> return False

Pruning a term requires traversing it looking for occurrences of forbidden 
variables. If any occur rigidly, the corresponding constraint is impossible. If 
a metavariable is encountered, it cannot depend on any arguments that contain 
rigid occurrences of forbidden variables, so it can be replaced by a fresh 
metavariable of restricted type. The pruneTm function generates a list of triples 
(β, U, f) where β is a metavariable, U is a type for a new metavariable γ and f γ is 
a solution for β. It maintains the invariant that U and f γ depend only on 
metavariables defined prior to β in the context. 

pruneTm :: Set Nom -> Tm -> Contextual [Instantiation]
pruneTm 𝒱 (Π S T)      = (++) <$> pruneTm 𝒱 S <*> pruneUnder 𝒱 T
pruneTm 𝒱 (Σ S T)      = (++) <$> pruneTm 𝒱 S <*> pruneUnder 𝒱 T
pruneTm 𝒱 (pair s t)   = (++) <$> pruneTm 𝒱 s <*> pruneTm 𝒱 t
pruneTm 𝒱 (λ b)        = pruneUnder 𝒱 b
pruneTm 𝒱 (M β · e)    = pruneMeta 𝒱 β e
pruneTm 𝒱 (C _)        = return []
pruneTm 𝒱 (V z _ · e)  | z ∈ 𝒱      = pruneElims 𝒱 e
                       | otherwise  = throwError "pruning error"

pruneUnder :: Set Nom -> Bind Nom Tm -> Contextual [Instantiation]
pruneUnder 𝒱 b = do  (x, t) <- unbind b
                     pruneTm (𝒱 ∪ {x}) t

pruneElims :: Set Nom -> Bwd Elim -> Contextual [Instantiation]
pruneElims 𝒱 e = fold <$> traverse pruneElim e
  where
    pruneElim (A a)       = pruneTm 𝒱 a
    pruneElim (If T s t)  = (++) <$> ((++) <$> pruneTm 𝒱 s <*> pruneTm 𝒱 t)
                                 <*> pruneUnder 𝒱 T
    pruneElim _           = return []

Once a metavariable has been found, pruneMeta unfolds its type as a telescope 
ΠΔ. T, and calls prune with the telescope and list of arguments. If the telescope 
is successfully pruned (Δ' is not the same as Δ) and the free variables of T remain 
in the telescope, then an instantiation of the metavariable is generated. 

pruneMeta :: Set Nom -> Nom -> Bwd Elim -> Contextual [Instantiation]
pruneMeta 𝒱 β e = do
    (Δ, T) <- telescope =<< lookupMeta β
    case prune 𝒱 Δ e of
        Just Δ'  | Δ' ≠ Δ, fv T ⊆ vars Δ'
                    -> return [(β, ΠΔ'. T, \β' -> λΔ. β' $*$ Δ')]
        _           -> return []

The prune function generates a restricted telescope, removing arguments that 
contain a rigid occurrence of a forbidden variable. This may fail if it is not clear 
which arguments must be removed. 

prune :: Set Nom -> Telescope -> Bwd Elim -> Maybe Telescope
prune 𝒱 • • = Just •
prune 𝒱 (Δ :< (x, S)) (e :< A s) = do
    Δ' <- prune 𝒱 Δ e
    case toVar s of
        Just y  | y ∈ 𝒱, fv S ⊆ vars Δ'  -> Just (Δ' :< (x, S))
        _       | fvrig s ⊈ 𝒱            -> Just Δ'
                | otherwise              -> Nothing
prune _ _ _ = Nothing
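
The three outcomes for each argument (keep, remove, or give up) can be sketched with a simplified model. The assumptions here are mine: telescope entries are bare names with no type dependencies, so the check on fv S is omitted, and each argument is described by the hypothetical Arg type, which records either a variable or the rigid free variables of some other term.

```haskell
-- An argument is either a variable or some other term, summarised by its
-- rigid free variables (a simplification made for this sketch).
data Arg = ArgVar String | ArgTm [String]

rigidVars :: Arg -> [String]
rigidVars (ArgVar y) = [y]
rigidVars (ArgTm vs) = vs

-- prune allowed tel args: restrict the telescope to arguments that are
-- allowed variables, drop arguments with a forbidden rigid variable, and
-- fail when it is unclear whether an argument must be removed.
prune :: [String] -> [String] -> [Arg] -> Maybe [String]
prune _ [] [] = Just []
prune allowed (p : tel) (a : args) = do
  tel' <- prune allowed tel args
  case a of
    ArgVar y | y `elem` allowed               -> Just (p : tel')
    _ | any (`notElem` allowed) (rigidVars a) -> Just tel'
      | otherwise                             -> Nothing
prune _ _ _ = Nothing
```

For example, with only x allowed, the arguments x and y prune a two-entry telescope down to its first entry, whereas a non-variable argument that mentions only allowed variables makes pruning give up.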

A metavariable a can be instantiated to a more specific type by moving left 
through the context until it is found, creating a new metavariable and solving for 
a. The type must not depend on any metavariables defined after a. 

type Instantiation = (Nom, Type, Tm -> Tm)

instantiate :: Instantiation -> Contextual ()
instantiate d@(α, T, f) = popL >>= \e -> case e of
    E β (U, HOLE)  | α == β  -> hole • T (\t -> define • β U (f t))
    _                        -> do  pushR (Right e)
                                    instantiate d

C.4.4 Metavariable simplification 

Given the name and type of a metavariable, lower attempts to simplify it by 
removing Σ-types, according to the metavariable simplification steps (4.21) and 
(4.22) in Figure 4.15 (page 80), as described in Subsection 4.2.4 (page 74). 

lower :: Telescope -> Nom -> Type -> Contextual Bool
lower Φ α (Σ S T) =  hole Φ S $ \s ->
                     hole Φ (T{s}) $ \t ->
                     define Φ α (Σ S T) (pair s t) >>
                     return True

lower Φ α (Π S T) = do  x <- fresh (s2n "x")
                        splitSig • x S >>= maybe
                            (lower (Φ :< (x, S)) α (T{var x}))
                            (\(y, A, z, B, s, (u, v)) ->
                                hole Φ (Πy:A. Πz:B. T{s}) $ \w ->
                                    define Φ α (Π S T) (λx. w $$ u $$ v) >>
                                    return True)

lower Φ α T = return False

Lowering a metavariable needs to split Σ-types (possibly underneath a bunch 
of parameters) into their components. For example, y : Πx:X. Σz:S. T splits into 
y₀ : Πx:X. S and y₁ : Πx:X. T{y₀ x}. Given the name of a variable and its type, 
splitSig attempts to split it, returning fresh variables for the two components of 
the Σ-type, an inhabitant of the original type in terms of the new variables and 
inhabitants of the new types by projecting the original variable. 

splitSig :: Telescope -> Nom -> Type ->
              Contextual (Maybe (Nom, Type, Nom, Type, Tm, (Tm, Tm)))
splitSig Φ x (Σ S T) = do  y <- fresh (s2n "y")
                           z <- fresh (s2n "z")
                           return $ Just  (y, ΠΦ. S, z, ΠΦ. (T{var y $*$ Φ}),
                                          λΦ. pair (var y $*$ Φ) (var z $*$ Φ),
                                          (λΦ. var x $*$ Φ %% Hd,
                                           λΦ. var x $*$ Φ %% Tl))
splitSig Φ x (Π A B) = do  a <- fresh (s2n "a")
                           splitSig (Φ :< (a, A)) x (B{var a})
splitSig _ _ _ = return Nothing

C.4.5 Problem simplification and unification 

Given a problem, the solver simplifies it according to the rules in Figure 4.14 
(page 79), introduces parameters and calls unify defined below if it is not already 
reflexive. In particular, problem simplification removes Σ-types from parameters, 
potentially eliminating projections, and replaces twins whose types are 
definitionally equal with a single parameter. This implements the steps described in 
Subsection 4.2.5 (page 75). 

solver :: Problem -> Contextual ()
solver (Unify q) = do  b <- isReflexive q
                       unless b (unify q)
solver (All p b) = do
    (x, q) <- unbind b
    case p of
        _ | x ∉ fv q  -> active q
        P S           -> splitSig • x S >>= \m -> case m of
            Just (y, A, z, B, s, _)  -> solver (∀y:A. ∀z:B. subst x s q)
            Nothing                  -> inScope x (P S) $ solver q
        S ‡ T         -> equal Set S T >>= \c ->
            if c  then  solver (∀x:S. subst x (var x) q)
                  else  inScope x (S ‡ T) $ solver q

The unify function performs a single unification step: η-expanding elements of 
Π or Σ types via the problem simplification steps (4.2) and (4.3) in Figure 4.14 
(page 79), or invoking an auxiliary function in order to make progress. 

unify :: Equation -> Contextual ()
unify ((f : Π A B) ≈ (g : Π S T)) = do
    x <- fresh (s2n "x")
    active $ ∀x : A‡S. (f $$ x : B{x}) ≈ (g $$ x : T{x})
unify ((t : Σ A B) ≈ (w : Σ C D)) = do
    active $ (hd t : A) ≈ (hd w : C)
    active $ (tl t : B{hd t}) ≈ (tl w : D{hd w})
unify q@(M α · e ≈ M β · e')
    | α == β = tryPrune q <|| tryPrune (sym q) <|| flexFlexSame q
unify q@(M α · e ≈ M β · e')  = tryPrune q <|| tryPrune (sym q) <|| flexFlex [] q
unify q@(M α · e ≈ t)         = tryPrune q <|| flexTerm [] q
unify q@(t ≈ M α · e)         = tryPrune (sym q) <|| flexTerm [] (sym q)
unify q                       = rigidRigid q

A rigid-rigid equation (between two non-metavariable terms) can either be 
decomposed into simpler equations or it is impossible to solve. For example, 
Πx:A. B ≈ Πx:S. T splits into A ≈ S, B ≈ T, but Πx:A. B ≈ Σx:S. T 
cannot be solved. The rigidRigid function implements steps (4.4)-(4.7) from 
Figure 4.14 (page 79), as described in Subsection 4.2.5 (page 75). Both unify and 
rigidRigid will be called only when the equation is not already reflexive. 

rigidRigid :: Equation -> Contextual ()
rigidRigid ((Π A B : Set) ≈ (Π S T : Set)) = do
    x <- fresh (s2n "x")
    active $ (A : Set) ≈ (S : Set)
    active $ ∀x : A‡S. (B{x́} : Set) ≈ (T{x̀} : Set)
rigidRigid ((Σ A B : Set) ≈ (Σ S T : Set)) = do
    x <- fresh (s2n "x")
    active $ (A : Set) ≈ (S : Set)
    active $ ∀x : A‡S. (B{x́} : Set) ≈ (T{x̀} : Set)
rigidRigid (V x w · e ≈ V x' w' · e') =
    matchSpine x w e x' w' e' >> return ()
rigidRigid q  | orthogonal q  = throwError "Rigid-rigid mismatch"
              | otherwise     = block $ Unify q

A constraint has no solutions if it equates two orthogonal terms, with different 
constructors or variables, as defined in Figure 4.13 (page 76). 

When there are rigid variables at the heads on both sides, proceed through 
the evaluation contexts, demanding that projections are identical and unifying 
terms in applications. Note that matchSpine returns the types of the two sides, 
used when unifying applications to determine the types of the arguments. For 
example, if y : Πx:S. T{x} -> U then the constraint y s t ≈ y u v will decompose 
into (s : S) ≈ (u : S) ∧ (t : T{s}) ≈ (v : T{u}). 

matchSpine :: Nom -> Twin -> Bwd Elim ->
                Nom -> Twin -> Bwd Elim ->
                  Contextual (Type, Type)
matchSpine x w • x' w' •
    | x == x'    = (,) <$> lookupVar x w <*> lookupVar x' w'
    | otherwise  = throwError "rigid head mismatch"
matchSpine x w (e :< A a) x' w' (e' :< A s) = do
    (Π A B, Π S T) <- matchSpine x w e x' w' e'
    active $ (a : A) ≈ (s : S)
    return (B{a}, T{s})
matchSpine x w (e :< Hd) x' w' (e' :< Hd) = do
    (Σ A B, Σ S T) <- matchSpine x w e x' w' e'
    return (A, S)
matchSpine x w (e :< Tl) x' w' (e' :< Tl) = do
    (Σ A B, Σ S T) <- matchSpine x w e x' w' e'
    return (B{V x w · (e :< Hd)}, T{V x' w' · (e' :< Hd)})
matchSpine x w (e :< If T s t) x' w' (e' :< If T' s' t') = do
    (𝔹, 𝔹) <- matchSpine x w e x' w' e'
    y <- fresh (s2n "y")
    active $ ∀y : 𝔹. (T{var y} : Type) ≈ (T'{var y} : Type)
    active $ (s : T{tt}) ≈ (s' : T'{tt})
    active $ (t : T{ff}) ≈ (t' : T'{ff})
    return (T{V x w · e}, T'{V x' w' · e'})
matchSpine _ _ _ _ _ _ = throwError "spine mismatch"

C.4.6 Solvitur ambulando 



Constraint solving is started by the ambulando function, which lazily propagates 
a substitution rightwards through the metacontext, making progress on problems 
where possible. It maintains the invariant that the entries to the left of the 
cursor include no active problems. This is not the only possible strategy: indeed, 
it is crucial for guaranteeing most general solutions that solving the constraints in 
any order would produce the same result. However, it is simple to implement and 
often works well with the heterogeneity invariant, because the problems making 
a constraint homogeneous will usually be solved before the constraint itself. 

ambulando :: Subs -> Contextual ()
ambulando θ = optional popR >>= \x -> case x of
    Nothing         -> return ()
    Just (Left θ')  -> ambulando (θ ∘ θ')
    Just (Right e)  -> case update θ e of
        e'@(E α (T, HOLE))  -> do  lower • α T <|| pushL e'
                                   ambulando θ
        Q Active p          -> do  pushR (Left θ)
                                   solver p
                                   ambulando []
        e'                  -> do  pushL e'
                                   ambulando θ

Each problem records its status, which is either Active and ready to be worked 
on or Blocked and unable to make progress. The update function applies a 
substitution to an entry, updating the status of a problem if its type changes. 

update :: Subs -> Entry -> Entry
update θ (Q s p) = Q s' p'
  where  p' = substs θ p
         s'  | p == p'    = s
             | otherwise  = Active
update θ e = substs θ e

For simplicity, Blocked problems do not store any information about when they 
may be resumed. An optimisation would be to track the conditions under which 
they should become active, typically when particular metavariables are solved or 
types become definitionally equal. 
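
The status-update rule can be illustrated with a toy model. Everything here is an assumption made for the sketch: terms are either metavariables or integer literals, a problem is a pair of terms, and a substitution is an association list.

```haskell
data Status = Blocked | Active
  deriving (Eq, Show)

-- Toy terms and problems (hypothetical, for illustration only).
data Tm = Meta String | Lit Int
  deriving (Eq, Show)

type Problem = (Tm, Tm)
type Subs    = [(String, Int)]

-- Replace a solved metavariable by its value.
substTm :: Subs -> Tm -> Tm
substTm th (Meta a) = maybe (Meta a) Lit (lookup a th)
substTm _  t        = t

-- Apply a substitution to a problem; if the problem changed, wake it up.
update :: Subs -> (Status, Problem) -> (Status, Problem)
update th (s, p) = (s', p')
  where
    p' = (substTm th (fst p), substTm th (snd p))
    s' | p == p'   = s
       | otherwise = Active
```

A blocked problem stays blocked under a substitution that does not touch it and becomes active as soon as one of its metavariables is replaced. This is the coarse wake-up policy that the suggested optimisation would refine.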

Appendix D 
Selected proofs 



This appendix contains details of selected proofs from Chapters 2-6. 

D.1 Correctness of unification and type inference 

Lemma 2.6 (Soundness and generality of unification). 

(a) If Θ₀ ⊢ τ ≡ υ : ⋆ ⊣ Θ₁ then Θ₀ ⊑ Θ₁ is a minimal solution of τ ≡ υ. 

(b) If Θ₀ | Ξ ⊢ α ≡ τ : ⋆ ⊣ Θ₁ then Θ₀, Ξ ⊑ Θ₁ is a minimal solution of α ≡ τ. 

Proof. By induction on the structure of derivations. For each 'unify' rule, one 
must verify that it gives a solution (i.e. Θ₀ ⊑ Θ₁ and Θ₁ ⊢ τ ≡ υ : ⋆), and 
that this solution is minimal (i.e. given any other solution θ : Θ₀ ⊑ Θ' such that 
Θ' ⊢ θ τ ≡ θ υ : ⋆, there is a cofactor ζ : Θ₁ ⊑ Θ' with θ ≡ ζ · ι). 

For each 'instantiate' rule one must verify Θ₀, Ξ ⊑ Θ₁, Θ₁ ⊢ α ≡ τ : ⋆ and 
that given any other solution θ : Θ₀, Ξ ⊑ Θ' such that Θ' ⊢ θ α ≡ θ τ : ⋆ there is 
a cofactor ζ : Θ₁ ⊑ Θ' with θ ≡ ζ · ι. 

The key idea is that the type variables of Θ₀ and Θ₁ are the same, and the 
definitions made in Θ₁ must hold as equations in Θ' for the problem to be solved, 
so the solution θ can be rearranged to produce the necessary cofactor. I consider 
some of the more interesting cases. 

For the DECOMPOSE rule, solutions to τ₀ → τ₁ ≡ υ₀ → υ₁ are exactly those 
that solve τ₀ ≡ υ₀ ∧ τ₁ ≡ υ₁, so it gives a minimal solution by the Optimist's 
lemma (Lemma 2.4). 

For the SKIP-SEMI rule, suppose that θ : Θ₀ ⨟ ⊑ Θ' ⨟ Ξ solves α ≡ β, so 
Θ' ⨟ Ξ ⊢ θ α ≡ θ β : ⋆. Now θ|Θ₀ : Θ₀ ⊑ Θ' by definition of the ⊑ relation, so by 
induction there exists ζ : Θ₁ ⊑ Θ' with θ ≡ ζ · ι. Then ζ : Θ₁ ⨟ ⊑ Θ' ⨟ Ξ is the 
required cofactor. 

For the INST-SKIP-SEMI rule, suppose that θ : Θ₀ ⨟ Ξ ⊑ Θ' ⨟ Ξ' solves α ≡ τ, so 
Θ' ⨟ Ξ' ⊢ θ α ≡ θ τ : ⋆. Now Θ₀ declares α by the input conditions (Definition 2.1), 
so θ α is a Θ'-type and θ τ is equal to it. Hence θ τ does not depend on any 
metavariables in Ξ'. Now all the metavariables declared in Ξ occur in τ, giving 
θ : Θ₀ ⨟ Ξ ⊑ Θ' and hence θ : Θ₀, Ξ ⊑ Θ'. By induction there exists ζ : Θ₁ ⊑ Θ' 
such that θ ≡ ζ · ι. □ 

Lemma 2.11 (Soundness and generality of type inference). If Θ₀ ⊢ t : τ ⊣ Θ₁, 
then Θ₀ ⊑ Θ₁ is a minimal solution to the type inference problem for t with output 
τ. Similarly, if Θ₀ ⊢ t : σ ⊣ Θ₁ then Θ₀ ⊑ Θ₁ is a minimal solution to the type 
scheme inference problem for t with output σ. 

Proof. Proceed by induction on derivations. It is straightforward to show that 
Θ₀ ⊑ Θ₁ and Θ₁ ⊢ t : τ or Θ₁ ⊢ t : σ. The more interesting part is establishing 
that the solution is minimal, for which suppose θ : Θ₀ ⊑ Θ' is a solution, and 
exhibit a cofactor ζ : Θ₁ ⊑ Θ'. 

The Generalist's lemma proves the property required for the INFER-GEN rule. 

For the INFER-VAR rule, suppose x : (∀Ξ.υ) ∈ Θ₀, θ : Θ₀ ⊑ Θ' and Θ' ⊢ x : υ'. 
By inversion, the proof must consist of the VAR rule, so Θ' ⊢ θ (∀Ξ.υ) ≻ υ'. 
Thus there is some substitution ζ : Θ', θΞ ⊑ Θ' such that Θ' ⊢ ζ (θ υ) ≡ υ' : ⋆ 
and ζ is the identity on Θ'. Weakening θ gives θ' : Θ₀, Ξ ⊑ Θ', θΞ and hence 
ζ · θ' : Θ₀, Ξ ⊑ Θ' is the required cofactor. 

For the INFER-LAM rule, suppose θ : Θ₀ ⊑ Θ' and Θ' ⊢ λx.t : υ → τ', then 
Θ', x : υ ⊢ t : τ' by inversion. Now (θ, υ/α) : Θ₀, α : ⋆, x : α ⊑ Θ', x : υ so induction 
on the first premise gives ζ : Θ₁, x : α, Ξ ⊑ Θ', x : υ such that (θ, υ/α) ≡ ζ · ι and 
Θ' ⊢ τ' ≡ ζ τ : ⋆. Thus ζ : Θ₁, Ξ ⊑ Θ' is the required cofactor. 

For the INFER-APP rule, the Optimist's lemma does not directly apply because 
it does not apply to problems with outputs, but the same reasoning applies.¹ 
Suppose θ : Θ₀ ⊑ Θ' and Θ' ⊢ s t : τ. By inversion, Θ' ⊢ s : τ' → τ and Θ' ⊢ t : τ' 
for some τ'. Thus induction on the first premise gives ζ : Θ₁ ⊑ Θ' such that 
θ ≡ ζ · ι and Θ' ⊢ ζ υ ≡ τ' → τ : ⋆. Now induction on the second premise gives 
ζ' : Θ₂ ⊑ Θ' such that ζ ≡ ζ' · ι and Θ' ⊢ ζ' υ' ≡ τ' : ⋆. Since there is a solution 
(ζ', τ/α) : Θ₂, α : ⋆ ⊑ Θ' such that Θ' ⊢ (ζ', τ/α) υ ≡ (ζ', τ/α) (υ' → α) : ⋆, 
Lemma 2.6 applied to the third premise gives ζ'' : Θ₃ ⊑ Θ' with (ζ', τ/α) ≡ ζ'' · ι. 
Now θ ≡ ζ'' · ι so ζ'' is the required cofactor. 

¹ The lemma can be generalised to apply to this rule (Gundry et al., 2010), but I omit the 
more general formulation here for simplicity of presentation. 

For the INFER-LET rule, suppose θ : Θ₀ ⊑ Θ' gives Θ' ⊢ let x = s in t : τ', then 
by inversion, Θ' ⨟ Ξ' ⊢ s : υ and Θ', x : (∀Ξ'.υ) ⊢ t : τ' for some υ. Now Θ' ⊢ s : (∀Ξ'.υ) 
so by induction on the first premise there must be some ζ : Θ₁ ⊑ Θ' such that 
θ ≡ ζ · ι and Θ' ⊢ ζ σ ≻ (∀Ξ'.υ). Now ζ : Θ₁, x : σ ⊑ Θ', x : ζ σ so by induction 
on the second premise there must be some ζ' : Θ₂, x : σ, Ξ ⊑ Θ', x : ζ σ such that 
ζ ≡ ζ' · ι and Θ', x : ζ σ ⊢ ζ' τ ≡ τ' : ⋆. Thus ζ' : Θ₂, Ξ ⊑ Θ' is the required 
cofactor since θ ≡ ζ' · ι and Θ' ⊢ ζ' τ ≡ τ' : ⋆. □ 

Lemma 2.12 (Completeness of type inference). 

(a) If (Θ₀, t) is a type inference problem with solution (θ : Θ₀ ⊑ Θ', υ), then 
Θ₀ ⊢ t : τ ⊣ Θ₁ for some Θ₁ and τ. 

(b) If (Θ₀, t) is a scheme inference problem with solution (θ : Θ₀ ⊑ Θ', σ'), then 
Θ₀ ⊢ t : σ ⊣ Θ₁ for some Θ₁ and σ. 

Proof. Proceed by induction on the derivation of Θ' ⊢ t : υ or Θ' ⊢ t : σ' in the 
transformed declarative system (Figure 2.8, page 25). 

For the VAR case, Θ' ∋ x : σ so Θ₀ ∋ x : σ₀ for some σ₀ by definition of 
information increase, and hence the INFER-VAR rule applies. 

For the LAM case, (θ, τ/α) : Θ₀, α : ⋆, x : α ⊑ Θ', x : τ with υ is a solution to 
the type inference problem for t, so by induction, Θ₀, α : ⋆, x : α ⊢ t : τ' ⊣ Θ₁' for 
some Θ₁' and τ'. Moreover, Θ₁' = Θ₁, x : α, Ξ by soundness of type inference and 
the definition of information increase, so the INFER-LAM rule applies. 

For the APP case, inversion gives Θ' ⊢ s : τ' → τ and Θ' ⊢ t : τ'. Two appeals 
to the inductive hypothesis show that inference succeeds for s and t, with types 
υ and υ'. Now generality of type inference gives ζ : Θ₂ ⊑ Θ' such that θ ≡ ζ · ι 
and Θ' ⊢ ζ υ ≡ τ' → τ ∧ ζ υ' ≡ τ'. Then (ζ, τ/α) : Θ₂, α : ⋆ ⊑ Θ' and 
Θ' ⊢ ζ υ ≡ ζ υ' → τ : ⋆ so Lemma 2.8 shows that the INFER-APP rule applies. 

For the LET case, observe that Θ' ⊢ s : ∀Ξ.υ so by induction using part (b), 
Θ₀ ⊢ s : σ ⊣ Θ₁ for some Θ₁ and σ. By generality of type inference, there exists 
ζ : Θ₁ ⊑ Θ' such that Θ' ⊢ ζ σ ≻ ∀Ξ.υ. Note that ζ : Θ₁, x : σ ⊑ Θ', x : ζ σ. Now 
Θ', x : ∀Ξ.υ ⊢ t : τ and hence Θ', x : ζ σ ⊢ t : τ, so the INFER-LET rule applies by 
induction. 

In part (b), suppose θ : Θ₀ ⊑ Θ' is a solution to the scheme inference problem 
for t, with output ∀Ξ'.υ. Then Θ' ⨟ Ξ' ⊢ t : υ. Now θ : Θ₀ ⨟ ⊑ Θ' ⨟ Ξ' so induction 
using part (a) gives Θ₀ ⨟ ⊢ t : τ ⊣ Θ₁ ⨟ Ξ and hence the INFER-GEN rule applies. □ 
D.2 Correctness of abelian group unification

Lemma 3.2 (Soundness and generality of abelian group unification). If the group unification algorithm succeeds with $\Theta_0 \parallel \Gamma \vdash \nu \equiv 1 : \mathcal{U} \dashv \Theta_1$, then $\Theta_0, \Gamma \sqsubseteq \Theta_1$ is a minimal solution of $\nu \equiv 1 : \mathcal{U}$.

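Proof. Proceed by induction on derivations. For soundness, it is easy to verify that $\iota : \Theta_0, \Gamma \sqsubseteq \Theta_1$ and $\Theta_1 \vdash \iota\nu \equiv 1 : \mathcal{U}$. Now consider generality for each rule in Figure 3.5 (page 37). In each case, suppose $\theta : \Theta_0, \Gamma \sqsubseteq \Theta'$ is such that $\Theta' \vdash \theta\nu \equiv 1 : \mathcal{U}$, and exhibit a cofactor $\zeta : \Theta_1 \sqsubseteq \Theta'$.

For U-TRIVIAL, the result is obvious.

For U-SKIP-SEMI, if $\Gamma$ is empty then the result is straightforward. Otherwise, $\Gamma$ contains a single unknown variable $\beta : \mathcal{U}$; let $\nu \equiv \beta^k * \nu'$. Moreover, suppose $\theta : \Theta_0 \fatsemi \beta : \mathcal{U} \sqsubseteq \Theta' \fatsemi \Xi$ is such that $\Theta' \fatsemi \Xi \vdash \theta(\beta^k * \nu') \equiv 1$. Rearranging gives $\Theta' \fatsemi \Xi \vdash (\theta\beta)^k \equiv (\theta\nu')^{-1}$ but $\theta\nu'$ is defined over $\Theta'$ so $\theta\beta$ must be defined over $\Theta'$. Thus $\theta : \Theta_0, \beta : \mathcal{U} \sqsubseteq \Theta'$ and the result follows by the inductive hypothesis.

For U-SKIP-TY, U-SKIP-TM and U-SUBS, it is straightforward to check that the inductive hypothesis gives the required cofactor.

For U-DEFINE, suppose $\theta : \Theta_0, \alpha : \mathcal{U}, \Gamma \sqsubseteq \Theta'$ is such that $\Theta' \vdash \theta(\alpha^k * \nu^k) \equiv 1$. Then $\Theta' \vdash (\theta(\alpha * \nu))^k \equiv 1$ and hence $\Theta' \vdash \theta(\alpha * \nu) \equiv 1$ for the free abelian group. Thus $\Theta' \vdash \theta\alpha \equiv \theta(\nu^{-1})$ and so $\theta \equiv \theta \cdot [\nu^{-1}/\alpha] : \Theta_0, \Gamma \sqsubseteq \Theta'$.

For U-REDUCE, apply the isomorphism lemma (Lemma 2.5, page 18). The inductive hypothesis gives that $\Theta_0, \Gamma, \beta : \mathcal{U} \sqsubseteq \Theta_1$ is a minimal solution of $\beta^k * R_k(\nu) \equiv 1$. Moreover $[\alpha * Q_k(\nu)^{-1}/\beta] : \Theta_0, \Gamma, \beta : \mathcal{U} \sqsubseteq \Theta_0, \alpha : \mathcal{U}, \Gamma$ is an isomorphism with inverse $[\beta * Q_k(\nu)/\alpha] : \Theta_0, \alpha : \mathcal{U}, \Gamma \sqsubseteq \Theta_0, \Gamma, \beta : \mathcal{U}$, so the isomorphism lemma gives that $\Theta_0, \alpha : \mathcal{U}, \Gamma \sqsubseteq \Theta_1, \alpha := \beta * Q_k(\nu) : \mathcal{U}$ is a minimal solution of $\alpha^k * \nu \equiv 1$.

For U-COLLECT, appeal directly to the inductive hypothesis. □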

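Lemma 3.3 (Completeness of abelian group unification). If $\nu$ is a well-formed unit of measure in $\Theta_0$, and there is some $\theta : \Theta_0 \sqsubseteq \Theta'$ such that $\Theta' \vdash \theta\nu \equiv 1 : \mathcal{U}$, then the algorithm produces $\Theta_1$ such that $\Theta_0 \parallel \cdot \vdash \nu \equiv 1 : \mathcal{U} \dashv \Theta_1$.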

Proof. First, establish termination of the rules when viewed as an algorithm, 
where hypotheses correspond to recursive calls. Termination is by the lexico- 
graphic order on the total length of the context (including T), the maximum 
power of a variable in the expression being unified, and the length of the first 
part of the context (excluding $\Gamma$). Only the U-REDUCE and U-COLLECT rules do not decrease the total length on recursive calls; moreover, U-REDUCE decreases the maximum power of a variable and U-COLLECT decreases the length of the first part of the context. Note that the final result may be longer than the original context, due to U-REDUCE.

The algorithm terminates, so proceeding by induction on the call graph allows reasoning about completeness. By inspection of the rules, observe that only two possible cases are not covered: either $\nu$ is a constant that is not equal to 1, or $\nu$ contains exactly one variable $\alpha$, and the power of $\alpha$ does not divide the powers of the constants. In either case, there are no possible solutions of the unification problem $\nu \equiv 1 : \mathcal{U}$.

Finally, note that each rule preserves solutions: that is, if the initial problem 
(conclusion of the rule) has a solution then the rewritten problem (hypothesis of 
the rule) must also have a solution. Hence failure of the algorithm indicates that 
the original problem had no solutions. □ 
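
The two failure cases identified above can be observed in a small executable model. The following is a minimal sketch in Python, not the rule system of Figure 3.5: it assumes a unit is represented as a map from symbol names to integer exponents, that variable and constant names are disjoint, and that names beginning with `_b` are reserved for fresh variables. The names `solve_unit` and `expand` are hypothetical. The loop either defines the variable of least power outright (when that power divides every other exponent) or introduces a fresh variable and reduces the remaining exponents modulo that power, in the manner of U-DEFINE and U-REDUCE.

```python
from typing import Dict, Optional

Unit = Dict[str, int]  # symbol name -> integer exponent (absent means 0)


def solve_unit(variables: Unit, constants: Unit) -> Optional[Dict[str, Unit]]:
    """Try to solve prod(x**e) * prod(c**e) == 1 for the variables.

    Returns a triangular substitution {variable: unit}, or None on failure.
    Assumes variable and constant names are disjoint (an assumption of this sketch).
    """
    vs = {x: e for x, e in variables.items() if e != 0}
    cs = {c: e for c, e in constants.items() if e != 0}
    subst: Dict[str, Unit] = {}
    fresh = 0
    while True:
        if not vs:
            # No variables left: solvable only if the constant part is 1.
            return subst if not cs else None
        a = min(vs, key=lambda x: abs(vs[x]))  # variable of least power
        k = vs[a]
        others = {x: e for x, e in vs.items() if x != a}
        rest = {**others, **cs}
        if all(e % k == 0 for e in rest.values()):
            # Define: a := prod(s ** -(e / k)), where every division is exact.
            subst[a] = {s: -(e // k) for s, e in rest.items()}
            return subst
        if not others:
            # Exactly one variable, whose power does not divide the constants.
            return None
        # Reduce: a := b * prod(s ** -(e div k)), leaving b**k * prod(s ** (e mod k)).
        b = f"_b{fresh}"
        fresh += 1
        subst[a] = {b: 1, **{s: -(e // k) for s, e in rest.items() if e // k != 0}}
        vs = {b: k, **{x: e % k for x, e in others.items() if e % k != 0}}
        cs = {c: e % k for c, e in cs.items() if e % k != 0}


def expand(unit: Unit, subst: Dict[str, Unit]) -> Unit:
    """Apply the substitution repeatedly until no solved variable remains."""
    unit = dict(unit)
    while any(x in subst for x in unit):
        x = next(x for x in unit if x in subst)
        e = unit.pop(x)
        for s, f in subst[x].items():
            unit[s] = unit.get(s, 0) + e * f
        unit = {s: f for s, f in unit.items() if f != 0}
    return unit
```

For instance, `solve_unit({"a": 2}, {"m": 3})` returns `None`, matching the second uncovered case, whereas expanding the original unit with the substitution returned for `{"a": 2, "b": 3}` yields the empty unit, that is, 1.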

Lemma 3.4 (Soundness and generality of type unification). 

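(a) If $\Theta_0 \vdash \tau \equiv \upsilon : \star \dashv \Theta_1$, then $\Theta_0 \sqsubseteq \Theta_1$ is a minimal solution of $\tau \equiv \upsilon : \star$.

(b) If $\Theta_0 \mid \Xi \vdash \alpha \equiv \tau : \star \dashv \Theta_1$, then $\Theta_0, \Xi \sqsubseteq \Theta_1$ is a minimal solution of $\alpha \equiv \tau : \star$.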

Proof. Proceed by induction on the structure of derivations, as in Lemma 2.6 (page 22). The majority of the cases are similar to the previous proof, but the UNIT rule is new and the INST rule has been modified. The INST-SKIP-SEMI rule requires a more subtle generality proof, in order to verify that instantiation moves only genuine dependencies. The input conditions ensure that units always occur in the form $\mathsf{F}(\alpha)$, so it is obvious that $\alpha$ is a dependency.

For the unit rule, the result follows from the soundness and generality of 
abelian group unification (Lemma 3.2). 

For the INST rule, use the Optimist's lemma (Lemma 2.4, page 18), which states that the minimal solution to a conjunction of problems is found by 'optimistically' solving the first problem in the original context, then solving the second problem in the resulting context. This rule fits the pattern as solutions to $\alpha \equiv \tau\{\overline{\nu_i}^i\} : \star$ are the same as solutions to $(\alpha \equiv \tau\{\overline{\beta_i}^i\} : \star) \wedge \overline{\beta_i \equiv \nu_i : \mathcal{U}}^i$ up to the equational theory.

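Recall the INST-SKIP-SEMI rule

$$\frac{\Theta_0 \mid \Xi \vdash \alpha \equiv \tau : \star \dashv \Theta_1}{\Theta_0 \fatsemi \mid \Xi \vdash \alpha \equiv \tau : \star \dashv \Theta_1 \fatsemi}$$

and suppose $\theta : \Theta_0 \fatsemi \Xi \sqsubseteq \Theta' \fatsemi \Xi'$ is such that $\Theta' \fatsemi \Xi' \vdash \theta\alpha \equiv \theta\tau : \star$. Now $\alpha : \star \in \Theta_0$ by the conditions for the algorithmic judgment, so $\theta\alpha$ is a $\Theta'$-type and $\theta\tau$ is equal to it. In the previous proof, I argued that $\theta\tau$ could not depend on $\Xi'$, but this does not hold for the equational theory of abelian groups, because equivalent expressions can have different sets of free variables. However, if $\beta : \mathcal{U} \in \Xi$ then $\mathsf{F}(\beta)$ is a subterm of $\tau$, so $\mathsf{F}(\theta\beta)$ is a subterm of $\theta\tau$ and hence there is some $\Theta'$-unit $\nu$ with $\theta\beta \equiv \nu$. Similarly, if $\gamma : \star \in \Xi$ then $\gamma \in \mathrm{fmv}(\tau)$ so $\theta\gamma$ is defined over $\Theta'$. Hence there is some $\theta' : \Theta_0 \fatsemi \Xi \sqsubseteq \Theta' \fatsemi$ with $\theta \equiv \theta'$, so $\theta' : \Theta_0, \Xi \sqsubseteq \Theta'$ and by induction there exists $\zeta : \Theta_1 \sqsubseteq \Theta'$ as required. □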

Lemma 3.5 (Completeness of type unification). 

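(a) If the types $\upsilon$ and $\tau$ are well-formed in $\Theta_0$ and there is some $\theta : \Theta_0 \sqsubseteq \Theta'$ with $\Theta' \vdash \theta\upsilon \equiv \theta\tau : \star$, then unification produces $\Theta_1$ such that $\Theta_0 \vdash \upsilon \equiv \tau : \star \dashv \Theta_1$.

(b) Moreover, if $\theta : \Theta_0, \Xi \sqsubseteq \Theta'$ is such that $\Theta' \vdash \theta\alpha \equiv \theta\tau : \star$ and the input conditions (Definition 3.1) are satisfied, then there is some context $\Theta_1$ such that $\Theta_0 \mid \Xi \vdash \alpha \equiv \tau : \star \dashv \Theta_1$.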

Proof. First establish that the system terminates, if viewed as an algorithm with inputs $\Theta_0$ (and $\Xi$), $\upsilon$ (or $\alpha$) and $\tau$, giving outputs $\Theta_1$ and $\theta$. The 'unify' judgments terminate because each recursive call removes a type metavariable from the context, decomposes the types or removes a unit metavariable. The 'instantiate' judgments either shorten the whole context or the part of the context before the bar. Note that the INST rule may add unit metavariables, but a type variable will be removed from the context by instantiation. Only the DECOMPOSE rule makes more than one recursive call to type unification, and it decomposes types so it does not matter that the intermediate context may have more unit metavariables.

Now proceed by structural induction on the call graph, observing that each rule in turn preserves solutions, and that all (potentially solvable) cases are covered. The only cases not covered are rigid-rigid mismatches (e.g. unifying $\upsilon \to \tau$ with $\mathsf{F}(\nu)$) and the flex-rigid problem $\alpha \equiv \tau$ in context $\Theta_0, \alpha : \star, \Xi$ where $\alpha \in \mathrm{fmv}(\tau)$. The latter has no solutions because the occurs check fails (if $\alpha$ is in $\Xi$ then the conditions of the lemma ensure $\tau$ depends on it), as in Lemma 2.8. The algorithm may also fail in abelian group unification, for which completeness is by Lemma 3.3. □
D.3 Correctness of Miller pattern unification



D.3.1 Consistency of the unification logic 

To prove consistency of the unification logic, as described in Section 4.3 (page 
81), it is enough to show that every derivation has a normal form. 

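Lemma 4.9. If $\Theta$ is solved, $\Theta \mid \Gamma \vdash P$ and $\delta$ is a substitution from $\Gamma$ to $\Delta$ that identifies twins, then $\Theta \mid \Delta \vdash \delta P\ \mathsf{is}$.

Proof. By induction on the derivation of $\Theta \mid \Gamma \vdash P$.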



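Case $\dfrac{\Theta \mid \Gamma \vdash \mathsf{ctx}}{\Theta \mid \Gamma \vdash \top}$. Trivial.

Case $\dfrac{\Theta \mid \Gamma \vdash \bot \qquad \Theta \mid \Gamma \vdash P\ \mathsf{wf}}{\Theta \mid \Gamma \vdash P}$. By induction, $\Theta \mid \Delta \vdash \bot\ \mathsf{is}$, which is impossible.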



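Case $\dfrac{\Theta \mid \Gamma, x : S \vdash P}{\Theta \mid \Gamma \vdash \forall x : S.\, P}$. $\Theta \mid \Delta, x : \delta S \vdash \delta P\ \mathsf{is}$ follows from the inductive hypothesis, and hence $\Theta \mid \Delta \vdash \forall x : \delta S.\, \delta P\ \mathsf{is}$.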



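Case $\dfrac{\Theta \mid \Gamma \vdash (S : \mathsf{Type}) \approx (T : \mathsf{Type}) \qquad \Theta \mid \Gamma, x : S \ddagger T \vdash P}{\Theta \mid \Gamma \vdash \forall x : S \ddagger T.\, P}$. Similarly to the previous case, induction gives $\Theta \mid \Delta \vdash \mathsf{Type} \ni \delta S =[\,U\,]= \delta T$ and $\Theta \mid \Delta, x : U \vdash \delta P\{x, x\}\ \mathsf{is}$, and hence $\Theta \mid \Delta \vdash \forall x : \delta S \ddagger \delta T.\, \delta P\ \mathsf{is}$.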



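Case $\dfrac{\Theta \mid \Gamma \vdash \mathsf{Type} \ni S =[\,U\,]= T \qquad \Theta \mid \Gamma, x : U \vdash P\{x, x\}}{\Theta \mid \Gamma \vdash \forall x : S \ddagger T.\, P}$. Similar to the previous case.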



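Case $\dfrac{\Theta \mid \Gamma \vdash \forall x : S.\, P \qquad \Theta \mid \Gamma \vdash S \ni s}{\Theta \mid \Gamma \vdash P\{s\}}$. By induction, $\Theta \mid \Delta \vdash \forall x : \delta S.\, \delta P\ \mathsf{is}$, so inversion gives $\Theta \mid \Delta, x : \delta S \vdash \delta P\ \mathsf{is}$. Then $\Theta \mid \Delta \vdash \delta P\{s\}\ \mathsf{is}$ by the substitution lemma.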
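Case $\dfrac{\Theta \mid \Gamma \vdash \forall x : S \ddagger T.\, P \qquad \Theta \mid \Gamma \vdash \mathsf{Type} \ni S =[\,U\,]= T \qquad \Theta \mid \Gamma \vdash U \ni u}{\Theta \mid \Gamma \vdash P\{u, u\}}$. By induction, $\Theta \mid \Delta \vdash \forall x : \delta S \ddagger \delta T.\, \delta P\ \mathsf{is}$, so $\Theta \mid \Delta \vdash \mathsf{Type} \ni \delta S =[\,U\,]= \delta T$ and $\Theta \mid \Delta, x : U \vdash \delta P\{x, x\}\ \mathsf{is}$ by inversion. Then $\Theta \mid \Delta \vdash U \ni \delta u$ so substitution gives $\Theta \mid \Delta \vdash \delta P\{\delta u, \delta u\}\ \mathsf{is}$.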



Case $\dfrac{\Theta \ni\ ?P \qquad \Theta \mid \Gamma \vdash \mathsf{ctx}}{\Theta \mid \Gamma \vdash P}$. $P$ is solved, so use Lemma 4.7 (page 81).



Conjunction introduction and elimination. Straightforward appeal to the 
inductive hypotheses. 



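Case $\dfrac{\Theta \mid \Gamma \vdash \Pi x : A.\, B \approx \Pi x : S.\, T}{\Theta \mid \Gamma \vdash A \approx S \wedge \forall x : A \ddagger S.\, B\{\acute{x}\} \approx T\{\grave{x}\}}$. The inductive hypothesis gives $\Theta \mid \Delta \vdash \Pi x : \delta A.\, \delta B \approx \Pi x : \delta S.\, \delta T\ \mathsf{is}$ so inversion using the definition of $\mathsf{is}$ gives $\Theta \mid \Delta \vdash \mathsf{Set} \ni \Pi x : \delta A.\, \delta B \equiv \Pi x : \delta S.\, \delta T$. Then inversion on the definitional equality gives $\Theta \mid \Delta \vdash \mathsf{Set} \ni \delta A =[\,U\,]= \delta S$ and $\Theta \mid \Delta, x : U \vdash \mathsf{Set} \ni \delta B \equiv \delta T$. Thus $\Theta \mid \Delta \vdash \delta A \approx \delta S \wedge \forall x : \delta A \ddagger \delta S.\, \delta B\{\acute{x}\} \approx \delta T\{\grave{x}\}\ \mathsf{is}$.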



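Case $\dfrac{\Theta \mid \Gamma \vdash \Sigma x : A.\, B \approx \Sigma x : S.\, T}{\Theta \mid \Gamma \vdash A \approx S \wedge \forall x : A \ddagger S.\, B\{\acute{x}\} \approx T\{\grave{x}\}}$. Similar to the previous case.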



Reflexivity, symmetry and transitivity. By Lemma 4.1 (page 65) and the definition of $\Theta \mid \Delta \vdash (s : S) \approx (t : T)\ \mathsf{is}$.



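Case $\dfrac{\Gamma \ni x : S \ddagger T \qquad \Theta \mid \Gamma \vdash \mathsf{ctx}}{\Theta \mid \Gamma \vdash (\acute{x} : S) \approx (\grave{x} : T)}$. Here $\delta$ identifies twins, so it must be the case that $\Theta \mid \Delta \vdash (\delta\acute{x} : \delta S) \approx (\delta\grave{x} : \delta T)\ \mathsf{is}$.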



Congruence rules (Figure 4.7, page 62). Each congruence rule corresponds to a rule of the definitional equality, except for the presence of twins. The heterogeneity invariant means that the types of twins are provably equal, so induction means they are definitionally equal and can be replaced with a single variable. □
D.3.2 Soundness

If twins have definitionally equal types, they can be replaced with a single variable: 

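Lemma D.1. Suppose $\Theta \mid \Gamma \vdash \mathsf{Type} \ni A =[\,U\,]= S$. Then $\Theta \mid \Gamma \vdash \forall x : A \ddagger S.\, P$ if and only if $\Theta \mid \Gamma \vdash \forall x : U.\, P\{x, x\}$.

Proof. For the forward direction, observe that $\Theta \mid \Gamma, y : U \vdash \forall x : A \ddagger S.\, P$ and instantiate this with $(y, y)/x$ to get $\Theta \mid \Gamma, y : U \vdash P\{y, y\}$, so $\Theta \mid \Gamma \vdash \forall y : U.\, P\{y, y\}$. For the reverse direction, if $\Theta \mid \Gamma \vdash \forall x : U.\, P\{x, x\}$ then $\Theta \mid \Gamma, y : U \vdash \forall x : U.\, P\{x, x\}$ so $\Theta \mid \Gamma, y : U \vdash P\{y, y\}$ and hence $\Theta \mid \Gamma \vdash \forall x : A \ddagger S.\, P$ by an inference rule. □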

The following lemma justifies decomposition of rigid-rigid equations between 
eliminated variables, which is part of the soundness of problem decomposition. 

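Lemma D.2. Suppose $x \cdot e \bowtie x' \cdot e' \rightsquigarrow P$, $\Theta \mid \Gamma \vdash x \cdot e \in S$ and $\Theta \mid \Gamma \vdash x' \cdot e' \in S'$. Then $\Theta \mid \Gamma \vdash P\ \mathsf{wf}$, and if $\Theta \mid \Gamma \vdash P$ then $\Theta \mid \Gamma \vdash x \cdot e \approx x' \cdot e'$.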

Proof. Prove both parts simultaneously by induction on e. □ 

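All the judgments are insensitive to $\eta$-contraction:

Lemma D.3.

(a) If $\Theta \mid \Gamma \vdash T \ni t\{\lambda x.\, n\, x\}$ then $\Theta \mid \Gamma \vdash T \ni t\{\lambda x.\, n\, x\} \equiv t\{n\}$.

(b) If $\Theta \mid \Gamma \vdash T \ni t\{(n\ \mathsf{hd}, n\ \mathsf{tl})\}$ then $\Theta \mid \Gamma \vdash T \ni t\{(n\ \mathsf{hd}, n\ \mathsf{tl})\} \equiv t\{n\}$.

(c) If $\Theta \mid \Gamma\{\lambda x.\, n\, x\} \vdash P\{\lambda x.\, n\, x\}$ then $\Theta \mid \Gamma\{n\} \vdash P\{n\}$.

(d) If $\Theta \mid \Gamma\{(n\ \mathsf{hd}, n\ \mathsf{tl})\} \vdash P\{(n\ \mathsf{hd}, n\ \mathsf{tl})\}$ then $\Theta \mid \Gamma\{n\} \vdash P\{n\}$.

(e) If $\Theta \mid \Gamma\{\lambda x.\, n\, x\} \vdash P\{\lambda x.\, n\, x\}\ \mathsf{is}$ then $\Theta \mid \Gamma\{n\} \vdash P\{n\}\ \mathsf{is}$.

(f) If $\Theta \mid \Gamma\{(n\ \mathsf{hd}, n\ \mathsf{tl})\} \vdash P\{(n\ \mathsf{hd}, n\ \mathsf{tl})\}\ \mathsf{is}$ then $\Theta \mid \Gamma\{n\} \vdash P\{n\}\ \mathsf{is}$.

(g) If $\Theta \mid \Gamma\{\lambda x.\, n\, x\} \vdash P\{\lambda x.\, n\, x\}\ \mathsf{wf}$ and $\Theta \mid \Gamma\{n\} \vdash P\{n\}$ then $\Theta \mid \Gamma\{\lambda x.\, n\, x\} \vdash P\{\lambda x.\, n\, x\}$.

(h) If $\Theta \mid \Gamma\{(n\ \mathsf{hd}, n\ \mathsf{tl})\} \vdash P\{(n\ \mathsf{hd}, n\ \mathsf{tl})\}\ \mathsf{wf}$ and $\Theta \mid \Gamma\{n\} \vdash P\{n\}$ then $\Theta \mid \Gamma\{(n\ \mathsf{hd}, n\ \mathsf{tl})\} \vdash P\{(n\ \mathsf{hd}, n\ \mathsf{tl})\}$.

Proof. Parts (a) and (b) are by structural induction on derivations. The remaining parts follow from them by induction on derivations, using context conversion (Lemma 4.5) and conversion (Lemma 4.6). □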
The problem decomposition operation, summarised in Figure 4.14 (page 79), is sound in that it preserves well-formedness and provability of problems:

Lemma 4.12. If $\Theta \mid \Gamma \vdash P\ \mathsf{wf}$ and $P \rightsquigarrow Q$ then

(a) $\Theta \mid \Gamma \vdash Q\ \mathsf{wf}$, and

(b) $\Theta \mid \Gamma \vdash Q$ implies $\Theta \mid \Gamma \vdash P$.

Proof of part (a). For reflexivity (4.1), $Q$ is trivial and hence well-formed.

For $\eta$-expansion of functions (4.2), $\Theta \mid \Gamma \vdash \Pi x : A.\, B \approx \Pi x : S.\, T$ from the definition of problem well-formedness, so $\Theta \mid \Gamma \vdash A \approx S \wedge \forall x : A \ddagger S.\, B\{\acute{x}\} \approx T\{\grave{x}\}$ by injectivity. The case of $\eta$-expansion of pairs (4.3) is similar.

For rigid-rigid decomposition of equations between $\Pi$-types (4.4) or $\Sigma$-types (4.5), the second component of the conjunction is well-formed because the first component may be assumed as a hypothesis.

For rigid-rigid decomposition of variable applications (4.6), use Lemma D.2.

For rigid-rigid mismatch (4.7), $Q$ is false and hence well-formed.

For $\eta$-contraction of subterms (4.8), (4.9), use Lemma D.3.

The cases that drop unused parameters or twins (4.10)-(4.13) correspond to proving admissibility for appropriate forms of strengthening.

Simplification of identical twins (4.14) and $\Sigma$-splitting of parameters (4.15) give well-formed results by the substitution lemma (Lemma 4.1, page 65). □

Proof of part (b). For reflexivity (4.1), $P$ holds definitionally.

For the steps that perform $\eta$-expansion and rigid-rigid decomposition of $\Pi$- or $\Sigma$-types (4.2)-(4.5), in each case, $P$ follows from $Q$ by a single application of the appropriate congruence rule from Figure 4.7.

For rigid-rigid decomposition of variable applications (4.6), use Lemma D.2.

For rigid-rigid mismatch (4.7), the proof of $Q = \bot$ can be eliminated to produce a proof of $P$.

For $\eta$-contraction steps (4.8) and (4.9), use Lemma D.3.

The cases that drop unused parameters or twins (4.10)-(4.13) correspond to proving admissibility for appropriate forms of weakening.

Lemma D.1 proves the required property for simplification of twins (4.14).

For $\Sigma$-splitting of parameters (4.15), instantiating $Q$ with $\lambda\Delta.\, x\, \Delta\ \mathsf{hd}$ for $y$ and $\lambda\Delta.\, x\, \Delta\ \mathsf{tl}$ for $z$ gives $P$ (up to uses of surjective pairing, using Lemma D.3). □
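Lemma D.4 (Soundness of pruning). Suppose $\Theta = \Theta_0, \beta : \Pi\Delta.\, T, \Theta_1$.

(a) If $\Theta \vdash \mathsf{mctx}$ and $\mathsf{prune}\ V\ \Delta\ \overline{t_i}^i \mapsto \Delta'$ then $\Theta_0 \mid \Delta' \vdash \mathsf{ctx}$ and $\mathrm{vars}(\Delta') \subseteq \mathrm{vars}(\Delta)$.

(b) If $\Theta \vdash \mathsf{mctx}$ and $\mathsf{pruneTm}\ V\ t \mapsto (\beta, \Delta')$ then $\Theta_0 \mid \cdot \vdash \mathsf{Type} \ni \Pi\Delta'.\, T$ and $\Theta_0, \gamma : \Pi\Delta'.\, T \mid \cdot \vdash \Pi\Delta.\, T \ni \lambda\Delta.\, \gamma\, \Delta'$.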

Proof. Part (a) is by induction on the definition of prune, observing that the bindings in $\Delta'$ are a subset of those in $\Delta$, and that prune retains a binding $x : S$ only if the free variables of $S$ have been retained in $\Delta'$. Part (b) then follows from part (a), and the fact that pruneTm checks that $\mathrm{fv}(T) \subseteq \mathrm{vars}(\Delta')$, so the type $\Pi\Delta'.\, T$ is well-formed. □
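
The two observations used for part (a) can be checked on a simplified model of pruning. The sketch below is an illustration under strong assumptions rather than the prune operation of Chapter 4: a telescope is a list of bindings, each recording only the free variables of its type; each argument of the metavariable is represented only by its set of free variables, all treated as rigid occurrences; and types mention only earlier telescope variables. The name `prune` and its signature here are hypothetical.

```python
from typing import List, Set, Tuple

Binding = Tuple[str, Set[str]]  # (variable name, free variables of its type)


def prune(allowed: Set[str], telescope: List[Binding],
          args: List[Set[str]]) -> List[Binding]:
    """Keep the bindings whose argument uses only allowed variables.

    A binding is also dropped when its type mentions a dropped binding,
    so the result is again a well-scoped telescope.
    """
    kept: List[Binding] = []
    kept_names: Set[str] = set()
    for (name, type_fvs), arg_fvs in zip(telescope, args):
        if arg_fvs <= allowed and type_fvs <= kept_names:
            kept.append((name, type_fvs))
            kept_names.add(name)
    return kept
```

By construction the kept bindings form a sub-telescope of the input, and every kept type mentions only kept variables, which mirrors the conclusion of part (a). The real operation can additionally get stuck on flexible occurrences, which this sketch does not model.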

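Lemma 4.13. If $\Theta \vdash \mathsf{mctx}$ and $\Theta \mapsto \Theta'$ then $\iota : \Theta \sqsubseteq \Theta'$.

Proof. By induction on the step taken.

For inversion (4.16), $\iota : \Theta, \alpha : T, \Theta' \sqsubseteq \Theta, \Theta_0, \alpha := \lambda\overline{x_i}^i.\, t : T, \Theta_1$ since $\Theta_0, \Theta_1$ is a dependency-respecting permutation of $\Theta'$ and the solution for $\alpha$ is well-typed. Moreover $\forall\Gamma.\, \alpha\, \overline{x_i}^i \approx t$ holds since $\alpha\, \overline{x_i}^i \equiv (\lambda\overline{x_i}^i.\, t)\, \overline{x_i}^i \equiv t$.

For occurs check failure (4.17), the result is trivial since any problem is true in a failed metacontext.

For equation solving by intersection (4.18), observe that $\forall\Gamma.\, \alpha\, \overline{x_i}^i \approx \alpha\, \overline{y_i}^i$ holds since $\alpha\, \overline{x_i}^i \equiv (\lambda\Delta.\, \beta\, \Delta')\, \overline{x_i}^i \equiv \beta\, \Delta'$ by the definition of intersection, and similarly $\alpha\, \overline{y_i}^i \equiv (\lambda\Delta.\, \beta\, \Delta')\, \overline{y_i}^i \equiv \beta\, \Delta'$.

For pruning (4.19), use Lemma D.4.

For pruning failure (4.20), the result is trivial since $\Theta'$ is failed.

For $\Sigma$-splitting (4.21), it suffices to check that if $\Theta \mid \cdot \vdash \mathsf{Type} \ni \Pi\Delta.\, \Sigma x : S.\, T$ then $\Theta \mid \cdot \vdash \mathsf{Type} \ni \Pi\Delta.\, S$; $\Theta, \alpha_0 : \Pi\Delta.\, S \mid \cdot \vdash \mathsf{Type} \ni \Pi\Delta.\, T\{\alpha_0\, \Delta\}$ and $\Theta, \alpha_0 : \Pi\Delta.\, S, \alpha_1 : \Pi\Delta.\, T\{\alpha_0\, \Delta\} \mid \cdot \vdash \Pi\Delta.\, \Sigma x : S.\, T \ni \lambda\Delta.\, (\alpha_0\, \Delta, \alpha_1\, \Delta)$.

For uncurrying (4.22), a similar check is needed.

For problem decomposition (4.23), Lemma 4.12 gives that $\Theta, ?\forall\Gamma.\, P \vdash \mathsf{mctx}$ and $P \rightsquigarrow Q$ implies $\Theta, ?\forall\Gamma.\, Q \mid \cdot \vdash \forall\Gamma.\, P$, since $\Theta, ?\forall\Gamma.\, Q \mid \Gamma \vdash Q$.

For conjunction splitting (4.24) and removing trivial problems (4.25), the result is trivial.

For the symmetry step (4.26), the result follows by induction and symmetry of the definitional equality (Lemma 4.4).

For the suffix step (4.27), observe that if $\theta : \Theta \sqsubseteq \Theta'$ and $\Theta, \Theta_0 \vdash \mathsf{mctx}$ then $\Theta', \theta\Theta_0 \vdash \mathsf{mctx}$ and weakening means that $(\theta, \iota) : \Theta, \Theta_0 \sqsubseteq \Theta', \theta\Theta_0$. □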
D.3.3 Generality

I will need standard no confusion and no cycle results for the definitional equality, 
in order to prove that the steps that reject impossible equations are most general. 

Lemma D.5 (No confusion). If $s \perp t$ then there are no $\Theta$ and $\Gamma$ such that $\Theta \mid \Gamma \vdash T \ni s \equiv t$. Moreover, if $s \perp t$ then $\theta s \perp \theta t$ for any metasubstitution $\theta$.

Proof. By induction on the derivation of $s \perp t$, and inversion on the definitional equality relation for the first part. □

Lemma D.6 (No cycle). Suppose $t$ contains a strong rigid occurrence of $\alpha\, \overline{t_i}^i$, or a rigid occurrence of $\alpha\, \overline{y_i}^i$. Then there are no $\Theta$, $\Gamma$, $\theta$ and $T$ such that $\Theta \mid \Gamma \vdash T \ni \theta(\alpha\, \overline{x_i}^i) \equiv \theta t$.

Proof. Suppose otherwise, and without loss of generality assume that $\theta$ substitutes $\lambda\overline{x_i}^i.\, s$ for $\alpha$, so $\theta(\alpha\, \overline{x_i}^i) \equiv s$. If $\alpha\, \overline{t_i}^i$ occurs strong rigidly (under a canonical constructor such as $\Pi$) in $t$, then $[\overline{t_i/x_i}]\, s \equiv [\overline{t_i/x_i}]\, (\theta t)$ occurs strong rigidly in $\theta t$. But substitution cannot remove strong rigid occurrences of subterms, so repeating this observation shows that $s$ contains an infinitely deep tree of canonical constructors, which is a contradiction.

If $\alpha\, \overline{y_i}^i$ occurs rigidly (under a canonical constructor or variable) in $t$, then $[\overline{y_i/x_i}]\, s$ occurs rigidly in $\theta t$. Now renaming does not change the size of a term, so $s$ is the same size as a subterm of itself, which is a contradiction. □

Lemma D.7. If $\Theta \mid \Gamma \vdash T \ni s \equiv t$ then $\mathrm{fv}(s) = \mathrm{fv}(t)$.

Proof. By induction on the derivation. □ 

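Lemma 4.15 (Generality of problem decomposition). If $\Theta \mid \Gamma \vdash P\ \mathsf{wf}$, the metasubstitution $\theta : \Theta, ?\forall\Gamma.\, P \sqsubseteq \Theta'$ is a solution and $P \rightsquigarrow Q$, then $\theta : \Theta, ?\forall\Gamma.\, Q \sqsubseteq \Theta'$.

Proof. Lemma 4.2 (page 65) implies $\Theta' \mid \cdot \vdash \theta(\forall\Gamma.\, P)$, so $\Theta' \mid \cdot \vdash \theta(\forall\Gamma.\, P)\ \mathsf{is}$ by Corollary 4.10 (page 82). Now proceed by case analysis on $P \rightsquigarrow Q$, supposing that $\theta(\forall\Gamma.\, P)$ is solved and showing that $\theta(\forall\Gamma.\, Q)$ is solved. Without loss of generality assume that $\Gamma$ contains no twins,² so suppose $\Theta' \mid \theta\Gamma \vdash P\ \mathsf{is}$ and show that $\Theta' \mid \theta\Gamma \vdash Q\ \mathsf{is}$.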

For reflexivity (4.1), Q is trivial. 

For the $\eta$-expansion and rigid-rigid decomposition steps (4.2)-(4.6), each case follows from inversion on the definitional equality: for example, consider the rule for $\Pi$-types (4.4). If $\Theta' \mid \theta\Gamma \vdash \mathsf{Set} \ni \Pi x : \theta A.\, \theta B =[\,\Pi x : U.\, V\,]= \Pi x : \theta S.\, \theta T$ then $\Theta' \mid \theta\Gamma \vdash \mathsf{Set} \ni \theta A =[\,U\,]= \theta S$ and $\Theta' \mid \theta\Gamma, x : U \vdash \mathsf{Set} \ni \theta B =[\,V\,]= \theta T$ by inversion. Hence $\Theta' \mid \cdot \vdash \theta(\forall\Gamma.\, A \approx S \wedge \forall x : A \ddagger S.\, B\{\acute{x}\} \approx T\{\grave{x}\})\ \mathsf{is}$.

2 By definition, a problem involving twins is solved if the types are equal and the corresponding problem without twins is solved.

For the rigid-rigid mismatch step (4.7), observe that metasubstitution cannot 
remove rigid differences, and rigidly different terms cannot be definitionally equal, 
by Lemma D.5. Thus there can be no solution 9. 

For the $\eta$-contraction steps (4.8) and (4.9), use Lemma D.3.

The cases that drop unused parameters or twins (4.10)-(4.13) correspond to proving admissibility for appropriate forms of strengthening.

For simplification of twins (4.14), there is nothing to prove, as the definition of $\forall x : S \ddagger T.\, P\ \mathsf{is}$ means $\Theta \mid \Gamma \vdash \mathsf{Type} \ni S =[\,U\,]= T$ and $\forall x : U.\, P\{x, x\}\ \mathsf{is}$.

For $\Sigma$-splitting of parameters (4.15), use Lemma 4.7 (page 81). □

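Lemma D.8 (Generality of pruning). If $\mathsf{pruneTm}\ (\mathrm{fv}(e))\ t \mapsto (\beta, \Delta')$ and there is some $\theta : \Theta, \beta : \Pi\Delta.\, T, \Theta' \sqsubseteq \Theta_1$ such that $\Theta_1 \mid \theta\Gamma \vdash U \ni \theta(\alpha \cdot e) \equiv \theta t$, then there exists $\zeta : \Theta, \gamma : \Pi\Delta'.\, T, \beta := \lambda\Delta.\, \gamma\, \Delta' : \Pi\Delta.\, T, \Theta', ?\forall\Gamma.\, \alpha \cdot e \approx t \sqsubseteq \Theta_1$ such that $\theta \equiv \zeta \cdot \iota$.

Proof. Let $\theta = (\theta_0, s/\beta, \theta_1)$ and observe that $s \equiv \lambda\Delta.\, u$ up to $\eta$-conversion. To see that $\mathrm{fv}(u) \subseteq \mathrm{vars}(\Delta')$, suppose otherwise, i.e. assume $x_j \in \mathrm{fv}(u) \setminus \mathrm{vars}(\Delta')$. By definition of pruning there is some subterm $\beta\, \overline{t_i}^i$ of $t$ such that $\mathsf{prune}\ V\ \Delta\ \overline{t_i}^i \mapsto \Delta'$. Thus $\theta t$ contains some $\theta t_j$ with $\mathrm{fv}^{\mathrm{rig}}(\theta t_j) \not\subseteq V$. Hence $\mathrm{fv}(\theta t) \not\subseteq \mathrm{fv}(\theta(\alpha \cdot e))$, which contradicts Lemma D.7. Thus $\mathrm{fv}(u) \subseteq \mathrm{vars}(\Delta')$, so the cofactor $\zeta$ can be taken to be $(\theta_0, (\lambda\Delta'.\, u)/\gamma, (\lambda\Delta.\, u)/\beta, \theta_1)$. □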

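Theorem 4.16 (Generality). If $\Theta_0 \vdash \mathsf{mctx}$, the metasubstitution $\theta : \Theta_0 \sqsubseteq \Theta'$ is a solution and $\Theta_0 \mapsto \Theta_1$ then there exists a cofactor $\zeta : \Theta_1 \sqsubseteq \Theta'$ such that $\theta \equiv \zeta \cdot \iota$.

Proof. By induction on the step taken. In each case, construct a suitable cofactor $\zeta$. If the induced metasubstitution $\iota : \Theta_0 \sqsubseteq \Theta_1$ is an isomorphism, its inverse can be composed with $\theta$ to obtain the required cofactor (Lemma 2.5, page 18).

For equation solving by inversion (4.16), let $\zeta$ be the appropriate permutation of $\theta$. Observe that $\theta$ is a solution so $\Theta' \mid \cdot \vdash \theta(\forall\Gamma.\, \alpha\, \overline{x_i}^i \approx t)\ \mathsf{is}$ and hence $\Theta' \mid \theta\Gamma \vdash T \ni (\theta\alpha)\, \overline{x_i}^i \equiv \theta t$. Then $\Theta' \mid \cdot \vdash \Pi\Delta.\, T \ni \theta\alpha \equiv \theta(\lambda\overline{x_i}^i.\, t)$ by congruence of $\lambda$, $\eta$-expansion and strengthening, so $\theta \equiv \zeta \cdot \iota$.

For occurs check failure (4.17), there can be no solution $\theta$ by Lemma D.6.

For equation solving by intersection (4.18), $\Theta' \mid \cdot \vdash \theta(\forall\Gamma.\, \alpha\, \overline{x_i}^i \approx \alpha\, \overline{y_i}^i)\ \mathsf{is}$ implies $\Theta' \mid \theta\Gamma \vdash T \ni (\theta\alpha)\, \overline{x_i}^i \equiv (\theta\alpha)\, \overline{y_i}^i$. Up to $\eta$, $\theta\alpha$ is of the form $\lambda\Delta.\, t$, and any variable bound in $\Delta$ corresponding to distinct variables in $\overline{x_i}^i$ and $\overline{y_i}^i$ must not occur in $t$, as the above definitional equality would fail. Hence $\zeta$ can substitute $\lambda\Delta'.\, t$ for $\beta$.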
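
The positions retained by intersection are exactly those at which the two argument lists agree. A minimal Python sketch of that computation follows, assuming the spines are given as equal-length lists of variable names; the function name `intersect` and the index-based representation of the retained telescope are assumptions of this illustration, not the definition used in Chapter 4.

```python
from typing import List, Sequence


def intersect(xs: Sequence[str], ys: Sequence[str]) -> List[int]:
    """Return the argument positions where both spines hold the same variable."""
    if len(xs) != len(ys):
        raise ValueError("both spines must apply the same metavariable")
    return [i for i, (x, y) in enumerate(zip(xs, ys)) if x == y]
```

For example, `intersect(["x", "y", "z"], ["x", "z", "z"])` retains positions 0 and 2. A body that mentioned the variable bound at position 1 would be instantiated differently on the two sides, which is why no solution can depend on it.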

For pruning (4.19), use Lemma D.8. 

For pruning failure (4.20), observe that metasubstitution cannot add free variables (i.e. $\mathrm{fv}(\theta s) \subseteq \mathrm{fv}(s)$) or remove rigid occurrences of free variables (i.e. $\mathrm{fv}^{\mathrm{rig}}(s) \subseteq \mathrm{fv}^{\mathrm{rig}}(\theta s)$), so the existence of a solution would contradict Lemma D.7.

For $\Sigma$-splitting (4.21), the induced metasubstitution is an isomorphism, with the inverse given by substituting $\lambda\Delta.\, \alpha\, \Delta\ \mathsf{hd}$ for $\alpha_0$ and $\lambda\Delta.\, \alpha\, \Delta\ \mathsf{tl}$ for $\alpha_1$.

Similarly, uncurrying (4.22) induces an isomorphism (with the inverse given by currying).

For problem decomposition (4.23), Lemma 4.15 shows that $\zeta = \theta$ suffices.

For conjunction splitting (4.24) and removal of trivial problems (4.25), the 
induced metasubstitution is an isomorphism. 

For the symmetry step (4.26), the result follows from the inductive hypothesis 
and the fact that definitional equality is symmetric. 

For the suffix step (4.27), the result follows by induction. □ 

D.3.4 Partial completeness 

Lemma 4.17. Suppose $\Theta$ is a well-formed metacontext in the pattern fragment that is not solved or failed. Then $\Theta \mapsto \Theta'$ for some $\Theta'$ in the pattern fragment.

Proof. By case analysis on the first unsolved problem in $\Theta$, using step (4.27) to skip later problems. If the first problem is a conjunction, step (4.24) applies. If not, it is of the form $\forall\Gamma.\, (s : S) \approx (t : T)$. Without loss of generality, assume that $\Gamma$ contains no twins (otherwise they can be removed by step (4.14)). Now $\Theta \mid \Gamma \vdash (S : \mathsf{Type}) \approx (T : \mathsf{Type})$ by the heterogeneity invariant, and hence $\Theta \mid \Gamma \vdash \mathsf{Type} \ni S \equiv T$ by Corollary 4.10. In particular, $\mathrm{fv}(S) = \mathrm{fv}(T)$ by Lemma D.7.

If $\beta \cdot e'$ is a subterm of $s$ or $t$, the pattern condition means that $e'$ consists only of projections and applications to variables. But any projections may be eliminated by the lowering step (4.21), so assume it includes only variables.

Now consider the possible cases for $s$ and $t$. If they are identical, then step (4.1) removes the reflexive equation. If one of them is a function or pair, then the appropriate $\eta$-expansion step (4.2) or (4.3) applies.

If they are both rigid, then either the heads match so one of the decomposition 
steps (4.4)-(4.6) applies, or they do not and the algorithm fails with (4.7). 
Otherwise, one of them is flexible. Suppose without loss of generality, using the symmetry step (4.26) if necessary, that $s = \alpha\, \overline{x_i}^i$ and consider the possible cases for $t$.

If $t = \alpha\, \overline{y_i}^i$ then step (4.18) applies: intersection always succeeds, and the condition on the free variables must hold since $S$ and $T$ have the same free variables, so any variable removed by intersection cannot occur in the type of $\alpha$.

If $t$ has a flexible occurrence of a variable that is not one of the $\overline{x_i}^i$ then pruning will take a step (4.19); the pattern condition ensures it will not get stuck. If $t$ has a rigid occurrence of a forbidden variable, then unification will fail with step (4.20).

If $t$ contains a rigid occurrence of $\alpha$, then the occurs check step (4.17) applies, since the evaluation context of $\alpha$ consists only of variables. By the pattern condition, $t$ contains no flexible occurrences of $\alpha$.

Finally, to apply the solution step (4.16), an appropriate permutation of the metacontext must exist, so that all the dependencies of $t$ can be moved before $\alpha$. Observe that the type of $t$ does not transitively depend on $\alpha$, since it is equal to the type of $\alpha\, \overline{x_i}^i$. Now by induction on the typing derivation for $t$, using the pattern condition and the fact that $t$ does not contain $\alpha$, none of the subterms of $t$ have types that depend on $\alpha$. In particular, none of the metavariables that occur in $t$ have types that depend on $\alpha$, so an appropriate permutation exists. (This induction requires the result type of an if-expression to contain no metavariables.)

□ 



D.4 Consistency of evidence language coercions 

The overall structure of the consistency proof for coercions in the evidence lan- 
guage is described in Section 6.5 (page 131). Here I will detail the proofs that 
were previously omitted, and prove required additional results. 
Note that the reduction relation is closed under substitution: 

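Lemma D.9. If $\rho \longrightarrow \rho'$ then $[\delta/\Delta]\rho \longrightarrow [\delta/\Delta]\rho'$.

Proof. By induction on the reduction step used. □

Lemma 6.14 (Transitivity). If $A_k(\tau \sim \upsilon)$ and $A_k(\upsilon \sim \kappa)$ then $A_k(\tau \sim \kappa)$.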

Proof. Proceed by induction on k and inversion on (</?). 

Consider the case for quantifiers, where $A_k((a_1 :^{T} \kappa_1) \to \tau_1 \sim (a_2 :^{T} \kappa_2) \to \tau_2)$ and $A_k((a_2 :^{T} \kappa_2) \to \tau_2 \sim (a_3 :^{T} \kappa_3) \to \tau_3)$. By definition, $A_k(\gamma_1 : \kappa_1 \sim \kappa_2)$ for some $\gamma_1$, and $A_k(\kappa_2 \sim \kappa_3)$, so induction gives $A_k(\kappa_1 \sim \kappa_3)$. In order to demonstrate that $A_k((a_1 :^{T} \kappa_1) \to \tau_1 \sim (a_3 :^{T} \kappa_3) \to \tau_3)$, suppose $\upsilon_1$ and $\upsilon_3$ have $A_l((\upsilon_1 : \kappa_1) \sim (\upsilon_3 : \kappa_3))$ for $l < k$, and seek to prove $A_l([\upsilon_1/a_1]\tau_1 \sim [\upsilon_3/a_3]\tau_3)$. But $A_l([\upsilon_1/a_1]\tau_1 \sim [\upsilon_1 \triangleright \gamma_1/a_2]\tau_2)$ and $A_l([\upsilon_1 \triangleright \gamma_1/a_2]\tau_2 \sim [\upsilon_3/a_3]\tau_3)$, so the result follows by induction.

The other cases where all three types are structural are similar. 

If all three types are computational, then they can each take a step by defini- 
tion, and the reducts are related by induction. 

If $\tau$ is computational but $\upsilon$ and $\kappa$ are structural, then the definition gives $\tau'$ structural or coerced such that $\tau \longrightarrow^{*} \tau'$ and $A_k(\tau' \sim \upsilon)$. Then induction gives $A_k(\tau' \sim \kappa)$ and hence $A_k(\tau \sim \kappa)$ by definition.

If $\tau$ and $\upsilon$ are computational but $\kappa$ is structural, then the definition gives $\upsilon'$ structural or coerced such that $\upsilon \longrightarrow^{*} \upsilon'$ and $A_k(\upsilon' \sim \kappa)$. Then there exists $\tau'$ such that $\tau \longrightarrow^{*} \tau'$ and $A_k(\tau' \sim \upsilon')$, so induction gives $A_k(\tau' \sim \kappa)$ and hence $A_k(\tau \sim \kappa)$.

The other cases where some of the types are computational and some are 
structural are similar. 

If any of $\tau$, $\upsilon$ and $\kappa$ are coerced, then the coercion(s) can be removed and the underlying types are compatible by induction. □

I need a couple of auxiliary results to prove that compatibility is closed under 
reduction. The first is straightforward. 

Lemma D.10. Suppose ■ h H : v (A) — > t and ■ h tc u : A. Then for any k, 
A fc (H to ~ H at) if and only if A k (u : A). 

Proof. By induction on the length of 00. □ 

Showing that compatible expressions satisfy progress is more interesting. This does not imply progress in general, because only type expressions (at phase $\forall$) are covered and they must be in the diagonal of compatibility.

Lemma D.11 (Progress for compatible expressions). If $A_k(\tau \sim \tau)$ for $k > 0$ then either $\tau$ is a coerced value type or $\tau$ can take a step.

Proof. By induction on $\tau$ and inversion on $A_k(\tau \sim \tau)$. If $\tau$ is computational then the definition states that it can take a step. If $\tau = \tau' \triangleright \gamma$ is coerced then $A_k(\tau' \sim \tau')$ so by induction either $\tau'$ is a coercion, a coerced value or can take a step, which implies the result. Otherwise, $\tau$ is structural: either it is immediately a value type, or it is an application $\tau'\, \rho$ and $A_k(\tau' \sim \tau')$ so induction on $\tau'$ implies the result. □
To deal with one coercion being cast by another, I need to show that compatibility of two propositions $(\varphi_1 \sim \varphi_2)$ means compatibility of $\varphi_1$ implies compatibility of $\varphi_2$. Observe that $\varphi_1$ and $\varphi_2$ are syntactically restricted to be quantified equations, not arbitrary types. Proving this lemma is the motivation for restricting quantification at phase □ to syntactic propositions only.

Lemma D.12. If $A_k(\varphi_1)$ and $A_k(\varphi_1 \sim \varphi_2)$ then $A_k(\varphi_2)$.

Proof. Proceed by induction on $k$ and case analysis on $\varphi_1$ and $\varphi_2$. Since they are both equations or quantified propositions, the definition of $A_k(\varphi_1 \sim \varphi_2)$ implies that they have the same form.

If $\varphi_1 = \tau_1 \sim \upsilon_1$ then $\varphi_2 = \tau_2 \sim \upsilon_2$ where $A_k(\tau_1 \sim \tau_2)$ and $A_k(\upsilon_1 \sim \upsilon_2)$. Moreover $A_k(\tau_1 \sim \upsilon_1)$, so $A_k(\tau_2 \sim \upsilon_2)$ by transitivity (Lemma 6.14).



If w = (ci : D p[) ->• </// then ^ 2 = (c 2 : D y?' 2 ) -»• ^2- For A fc ((c 2 : D </?' 2 ) -> ^2), 



suppose i] is such that Ai(i] : tp' 2 ) for some I < k. Now A;(7 : ip 2 ~ v^i) f° r some 
7 by definition of A k (pi ~ </? 2 ) and downward closure (Lemma 6.15, page 135). 
By induction, Ai(r) > 7 : y^). Then A;([?y > 7/ci] </?'/ ~ fa/^]^') ^y definition 
of A k (cp 1 ~ </? 2 ). Moreover, Ai([r) > 7/ci] </?") by definition of A fe ((/? 1 ). Hence 
A/([ry/c 2 ] <p 2 ) by induction, so A fc ((c 2 : D ip f 2 ) — >■ </? 2 ) as required. 

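If $\varphi_1 = (x_1 :^{\lambda} \tau_1) \to \varphi_1'$ then $\varphi_2 = (x_2 :^{\lambda} \tau_2) \to \varphi_2'$. Now the assumptions imply $A_k(\varphi_1')$ and $A_k(\varphi_1' \sim \varphi_2')$, so $A_k(\varphi_2')$ by induction, and hence $A_k(\varphi_2)$.

Finally, if $\varphi_1 = (a_1 :^{T} \kappa_1) \to \varphi_1'$ then $\varphi_2 = (a_2 :^{T} \kappa_2) \to \varphi_2'$. To show $A_k((a_2 :^{T} \kappa_2) \to \varphi_2')$, suppose $\cdot \vdash \tau :^{\forall} \kappa_2$ and $A_l(\tau \sim \tau)$ for some $l < k$. Now $A_k(\varphi_1 \sim \varphi_2)$ implies $A_k(\eta : \kappa_2 \sim \kappa_1)$ for some $\eta$. Moreover, $A_l([\tau \triangleright \eta / a_1]\,\varphi_1')$ and $A_l([\tau \triangleright \eta / a_1]\,\varphi_1' \sim [\tau / a_2]\,\varphi_2')$ follow from the assumptions, so $A_l([\tau / a_2]\,\varphi_2')$ by induction. Hence $A_k((a_2 :^{T} \kappa_2) \to \varphi_2')$ as required. □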
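The equation case of the proof of Lemma D.12 cites only transitivity. A sketch of the full chain follows, under the assumption that compatibility at a fixed index $k$ is also symmetric (symmetry is invoked elsewhere in this appendix, but no lemma number for it appears in this section):

```latex
\begin{align*}
A_k(\tau_2 \sim \tau_1)
  &\quad \text{by symmetry, from } A_k(\tau_1 \sim \tau_2) \\
A_k(\tau_2 \sim \upsilon_1)
  &\quad \text{by transitivity with } A_k(\tau_1 \sim \upsilon_1) \\
A_k(\tau_2 \sim \upsilon_2)
  &\quad \text{by transitivity with } A_k(\upsilon_1 \sim \upsilon_2)
\end{align*}
```

The last line is $A_k(\varphi_2)$ for $\varphi_2 = \tau_2 \sim \upsilon_2$, which is what the case requires.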

The following result shows that the step coercion preserves compatibility. 

Lemma 6.16 (Reduction preserves compatibility). If $\tau \xrightarrow{kpush} \upsilon$ and $A_k(\tau \sim \tau)$ then $A_{k-1}(\tau \sim \upsilon)$.

Proof. By induction on $k$ and the reduction step $\tau \longrightarrow \upsilon$.



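Case
$$\frac{\rho \longrightarrow \rho'}{\rho \triangleright \eta \longrightarrow \rho' \triangleright \eta}$$
If $A_k(\rho \triangleright \eta \sim \rho \triangleright \eta)$ then $A_k(\rho \sim \rho)$ and $A_{k-1}(\eta : \varphi)$. Hence $A_{k-1}(\rho \sim \rho')$ by induction, so $A_{k-1}(\rho \triangleright \eta \sim \rho' \triangleright \eta)$ as required.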






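Case
$$\frac{\rho \longrightarrow \rho'}{\rho\,\rho'' \longrightarrow \rho'\,\rho''}$$
If $A_k(\rho\,\rho'' \sim \rho\,\rho'')$ then $A_k(\rho \sim \rho)$ so by induction $A_{k-1}(\rho \sim \rho')$ and hence $A_{k-1}(\rho\,\rho'' \sim \rho'\,\rho'')$.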



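Case
$$\frac{\rho \xrightarrow{kpush} \rho'}{\mathsf{case}\ \rho\ \mathsf{of}\ \overline{br_i} \longrightarrow \mathsf{case}\ \rho'\ \mathsf{of}\ \overline{br_i}}$$
If $A_k(\mathsf{case}\ \rho\ \mathsf{of}\ \overline{br_i} \sim \mathsf{case}\ \rho\ \mathsf{of}\ \overline{br_i})$ then $A_{k-1}(\mathsf{case}\ \rho'\ \mathsf{of}\ \overline{br_i} \sim \mathsf{case}\ \rho'\ \mathsf{of}\ \overline{br_i})$ by definition. By Lemma D.11, there is $\tau$ with $\mathsf{case}\ \rho'\ \mathsf{of}\ \overline{br_i} \longrightarrow \tau$, and induction gives $A_{k-2}(\mathsf{case}\ \rho'\ \mathsf{of}\ \overline{br_i} \sim \tau)$. Hence by definition $A_{k-1}(\mathsf{case}\ \rho\ \mathsf{of}\ \overline{br_i} \sim \mathsf{case}\ \rho'\ \mathsf{of}\ \overline{br_i})$.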



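Case
$$\frac{e \xrightarrow{kpush} e' \qquad br_0' = br_0 \triangleright \mathsf{step}\,e \quad \ldots \quad br_n' = br_n \triangleright \mathsf{step}\,e}{\mathsf{dcase}\ e\ \mathsf{of}\ br_0 \ldots br_n \longrightarrow \mathsf{dcase}\ e'\ \mathsf{of}\ br_0' \ldots br_n'}$$
Similar to previous case.

Case
$$\frac{K\,\Delta \to \rho \in \overline{br_i}}{\mathsf{case}\ K\,\overline{\tau_j}\,\delta\ \mathsf{of}\ \overline{br_i} \longrightarrow [\delta/\Delta]\,\rho}$$
If $A_k(\mathsf{case}\ K\,\overline{\tau_j}\,\delta\ \mathsf{of}\ \overline{br_i} \sim \mathsf{case}\ K\,\overline{\tau_j}\,\delta\ \mathsf{of}\ \overline{br_i})$ then $A_{k-1}([\delta/\Delta]\,\rho \sim [\delta/\Delta]\,\rho)$ by definition. If $[\delta/\Delta]\,\rho$ is computational, then proceed as in the previous two cases. If it is structural, then $A_{k-1}(\mathsf{case}\ K\,\overline{\tau_j}\,\delta\ \mathsf{of}\ \overline{br_i} \sim [\delta/\Delta]\,\rho)$ is immediate from the definition. If it is coerced, then unwrap coercions until a computational or structural type is reached, and the required property follows as before.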



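Case
$$\frac{K\,\Delta \to \rho \in \overline{br_i}}{\mathsf{dcase}\ K\,\overline{\tau_j}\,\delta\ \mathsf{of}\ \overline{br_i} \longrightarrow [(\delta, (K\,\overline{\tau_j}\,\delta))/\Delta]\,\rho}$$
Similar to previous case.

Case
$$\frac{\Sigma \ni f\,[\Delta] = \rho : \kappa}{f(\delta) \longrightarrow [\delta/\Delta]\,\rho}$$
Similar to previous case.

Case
$$\frac{\Gamma \vdash \gamma :^{\Box} ((a_1 :^{T} \kappa_1) \to \tau_1) \sim ((a_2 :^{T} \kappa_2) \to \tau_2) \qquad \gamma_0 = \mathsf{sym}\,(\mathsf{left}\,\gamma) \qquad \gamma_1 = \gamma\,@\,(\mathsf{coh}\,(\tau)\,\gamma_0)}{(\upsilon \triangleright \gamma)\;{}^{T}\,\tau \longrightarrow \upsilon\;{}^{T}\,(\tau \triangleright \gamma_0) \triangleright \gamma_1}$$
If $A_k((\upsilon \triangleright \gamma)\,\tau \sim (\upsilon \triangleright \gamma)\,\tau)$ then the definition gives $A_k(\upsilon \sim \upsilon)$, $A_k(\tau \sim \tau)$ and $A_{k-1}(\gamma : (a_1 :^{T} \kappa_1) \to \tau_1 \sim (a_2 :^{T} \kappa_2) \to \tau_2)$. Hence $A_{k-1}(\upsilon \triangleright \gamma \sim \upsilon)$. Now $A_{k-1}(\gamma_0 : \kappa_2 \sim \kappa_1)$, so $A_{k-1}(\tau \sim \tau \triangleright \gamma_0)$ and $A_{k-2}(\gamma_1 : [\tau \triangleright \gamma_0 / a_1]\,\tau_1 \sim [\tau / a_2]\,\tau_2)$. Thus $A_{k-1}((\upsilon \triangleright \gamma)\,\tau \sim \upsilon\,(\tau \triangleright \gamma_0) \triangleright \gamma_1)$.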
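Case
$$\frac{\Gamma \vdash \gamma :^{\Box} ((c_1 :^{\Box} \varphi_1) \to \tau_1) \sim ((c_2 :^{\Box} \varphi_2) \to \tau_2) \qquad \gamma_0 = \mathsf{sym}\,(\mathsf{left}\,\gamma) \qquad \gamma_1 = \gamma\,@\,(\eta \triangleright \gamma_0, \eta)}{(\upsilon \triangleright \gamma)\;{}^{\Box}\,\eta \longrightarrow \upsilon\;{}^{\Box}\,(\eta \triangleright \gamma_0) \triangleright \gamma_1}$$
If $A_k((\upsilon \triangleright \gamma)\,\eta \sim (\upsilon \triangleright \gamma)\,\eta)$ then $A_k(\upsilon \sim \upsilon)$, $A_{k-1}(\eta : \varphi_2)$ and $A_{k-1}(\gamma : (c_1 :^{\Box} \varphi_1) \to \tau_1 \sim (c_2 :^{\Box} \varphi_2) \to \tau_2)$. Hence $A_{k-1}(\upsilon \triangleright \gamma \sim \upsilon)$. Now $A_{k-1}(\gamma_0 : \varphi_2 \sim \varphi_1)$, so $A_{k-2}(\eta \triangleright \gamma_0 : \varphi_1)$ by Lemma D.12, and $A_{k-2}(\gamma_1 : [\eta \triangleright \gamma_0 / c_1]\,\tau_1 \sim [\eta / c_2]\,\tau_2)$. From this it follows that $A_{k-1}((\upsilon \triangleright \gamma)\,\eta \sim \upsilon\,(\eta \triangleright \gamma_0) \triangleright \gamma_1)$.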



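Case
$$\frac{\Gamma \vdash \gamma :^{\Box} ((a_1 :^{\lambda} \kappa_1) \to \tau_1) \sim ((a_2 :^{\lambda} \kappa_2) \to \tau_2) \qquad \gamma_0 = \mathsf{sym}\,(\mathsf{left}\,\gamma) \qquad \gamma_1 = \mathsf{right}\,\gamma}{(\upsilon \triangleright \gamma)\;{}^{\lambda}\,\rho \longrightarrow \upsilon\;{}^{\lambda}\,(\rho \triangleright \gamma_0) \triangleright \gamma_1}$$
If $A_k((\upsilon \triangleright \gamma)\,\rho \sim (\upsilon \triangleright \gamma)\,\rho)$ then $A_k(\upsilon \sim \upsilon)$, $A_k(\rho \sim \rho)$ and $A_{k-1}(\gamma : (a_1 :^{\lambda} \kappa_1) \to \tau_1 \sim (a_2 :^{\lambda} \kappa_2) \to \tau_2)$. Hence $A_{k-1}(\upsilon \triangleright \gamma \sim \upsilon)$. Now $A_{k-1}(\gamma_0 : \kappa_2 \sim \kappa_1)$ and $A_{k-2}(\gamma_1 : \tau_1 \sim \tau_2)$. Hence $A_{k-1}(\rho \sim \rho \triangleright \gamma_0)$ and so $A_{k-1}((\upsilon \triangleright \gamma)\,\rho \sim \upsilon\,(\rho \triangleright \gamma_0) \triangleright \gamma_1)$.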



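Case
$$(\upsilon \triangleright \gamma) \triangleright \gamma' \longrightarrow \upsilon \triangleright (\gamma ; \gamma')$$
If $A_k((\upsilon \triangleright \gamma) \triangleright \gamma' \sim (\upsilon \triangleright \gamma) \triangleright \gamma')$ then $A_k(\upsilon \sim \upsilon)$, $A_{k-1}(\gamma : \tau_0 \sim \tau_1)$ and $A_{k-1}(\gamma' : \tau_1 \sim \tau_2)$. Transitivity gives $A_{k-1}(\tau_0 \sim \tau_2)$ and hence $A_k((\upsilon \triangleright \gamma) \triangleright \gamma' \sim \upsilon \triangleright (\gamma ; \gamma'))$.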



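Case: the push rule for a coerced constructor application $(K\,\overline{\tau_i}\,\delta) \triangleright \gamma$, where $\Gamma \vdash \gamma :^{\Box} D\,\overline{\tau_i} \sim D\,\overline{\upsilon_i}$, $\Sigma \ni K : (\overline{a_i :^{\forall} \kappa_i}, \Delta) \to D\,\overline{a_i}$, and $\omega$ is the telescoped coercion for $\Delta$ obtained by extending $\overline{(\tau_i, \upsilon_i, \mathsf{nth}_i\,\gamma)} : \overline{a_i}$ with $\delta$.

If $A_k((K\,\overline{\tau_i}\,\delta) \triangleright \gamma \sim (K\,\overline{\tau_i}\,\delta) \triangleright \gamma)$ then the definition gives $A_k(K\,\overline{\tau_i}\,\delta \sim K\,\overline{\tau_i}\,\delta)$ and $A_{k-1}(\gamma : D\,\overline{\tau_i} \sim D\,\overline{\upsilon_i})$. Lemma D.10 gives $A_{k-1}(\overline{(\tau_i, \upsilon_i, \mathsf{nth}_i\,\gamma)} : \overline{a_i :^{\forall} \kappa_i})$, and $A_{k-1}((\overline{(\tau_i, \upsilon_i, \mathsf{nth}_i\,\gamma)}, \omega) : \overline{a_i :^{\forall} \kappa_i}, \Delta)$ follows from the definition on coerced types since telescoped coercion extension appends coerced copies of types. Hence $A_{k-1}(K\,\overline{\tau_i}\,\overleftarrow{\omega} \sim K\,\overline{\upsilon_i}\,\overrightarrow{\omega})$ and so $A_{k-1}((K\,\overline{\tau_i}\,\delta) \triangleright \gamma \sim K\,\overline{\upsilon_i}\,\overrightarrow{\omega})$ as required. □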

To prove congruence for case analysis, I need that whenever an expression is equivalent to an applied constructor, the expression reduces to the same head constructor (possibly under a coercion). This follows from the definition of compatibility on structural expressions.
Lemma D.13. If $A_k(H\,\delta \sim \tau)$ then either $\tau \longrightarrow^{*} H\,\delta'$ or $\tau \longrightarrow^{*} H\,\delta' \triangleright \gamma$, and $A_k(H\,\delta \sim H\,\delta')$.

Proof. By induction on the length of $\delta$ and the structure of $\tau$. □

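Lemma 6.17 (Congruence for case analysis). If $A_k(e \sim e')$ and $A_k(br_i \sim br_i')$ for all $i$, then $A_k((\mathsf{d})\mathsf{case}\ e\ \mathsf{of}\ \overline{br_i} \sim (\mathsf{d})\mathsf{case}\ e'\ \mathsf{of}\ \overline{br_i'})$.

Proof. By induction on $k$, $e$ and $e'$.

If $e \xrightarrow{kpush} e_0$ and $e' \xrightarrow{kpush} e_0'$, then Lemma 6.16 (page 135) and transitivity give $A_{k-1}(e_0 \sim e_0')$, and $A_{k-1}((\mathsf{d})\mathsf{case}\ e_0\ \mathsf{of}\ \overline{br_i} \sim (\mathsf{d})\mathsf{case}\ e_0'\ \mathsf{of}\ \overline{br_i'})$ by induction, so the result follows.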

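Suppose without loss of generality that $e$ cannot step, then by Lemma D.11 either $k = 0$ (and the result is trivial) or $e$ is a value. It cannot have an outermost coercion, since Lemma D.13 ensures the case scrutinee push step would be applicable. The canonical forms lemma (Lemma 6.12) means that $e = K\,\overline{\tau_j}\,\delta$. By Lemma D.13, $e' \longrightarrow^{*} K\,\overline{\tau_j}\,\delta'$ and $A_k(K\,\overline{\tau_j}\,\delta \sim K\,\overline{\tau_j}\,\delta')$.

For case expressions, there are $K\,\Delta_0 \to \tau_0 \in \overline{br_i}$ and $K\,\Delta_0' \to \tau_0' \in \overline{br_i'}$, so

$$\mathsf{case}\ K\,\overline{\tau_j}\,\delta\ \mathsf{of}\ \overline{br_i} \longrightarrow [\delta/\Delta_0]\,\tau_0 \quad\text{and}\quad \mathsf{case}\ K\,\overline{\tau_j}\,\delta'\ \mathsf{of}\ \overline{br_i'} \longrightarrow [\delta'/\Delta_0']\,\tau_0'.$$

Moreover $A_k(K\,\Delta_0 \to \tau_0 \approx K\,\Delta_0' \to \tau_0')$ gives $A_k((\Delta_0 \bowtie \Delta_0') \to (\tau_0 \sim \tau_0'))$. Instantiating this with $\delta$ and $\delta'$ yields $A_{k-1}([\delta/\Delta_0]\,\tau_0 \sim [\delta'/\Delta_0']\,\tau_0')$, so

$$A_k(\mathsf{case}\ K\,\overline{\tau_j}\,\delta\ \mathsf{of}\ \overline{br_i} \sim \mathsf{case}\ K\,\overline{\tau_j}\,\delta'\ \mathsf{of}\ \overline{br_i'}).$$

Now $A_k(\mathsf{case}\ e'\ \mathsf{of}\ \overline{br_i'} \sim \mathsf{case}\ K\,\overline{\tau_j}\,\delta'\ \mathsf{of}\ \overline{br_i'})$ follows from Lemma 6.16 since the left side reduces to the right side, so $A_k(\mathsf{case}\ e\ \mathsf{of}\ \overline{br_i} \sim \mathsf{case}\ e'\ \mathsf{of}\ \overline{br_i'})$.

The argument for dcase expressions is similar: $\delta$ and $\delta'$ are replaced with $\delta, (K\,\overline{\tau_j}\,\delta)$ and $\delta', (K\,\overline{\tau_j}\,\delta')$; proof irrelevance means nothing more is needed. □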
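The case-expression argument in the proof of Lemma 6.17 can be read as a three-link chain. The sketch below only restates the links given in the proof; the final composition by transitivity (with symmetry for the last link) is left implicit there and is an assumption of this summary:

```latex
\begin{align*}
\mathsf{case}\ e\ \mathsf{of}\ \overline{br_i}
  &= \mathsf{case}\ K\,\overline{\tau_j}\,\delta\ \mathsf{of}\ \overline{br_i}
  && (e \text{ is a value; canonical forms}) \\
  &\sim \mathsf{case}\ K\,\overline{\tau_j}\,\delta'\ \mathsf{of}\ \overline{br_i'}
  && (\text{both sides step to compatible branch bodies}) \\
  &\sim \mathsf{case}\ e'\ \mathsf{of}\ \overline{br_i'}
  && (e' \longrightarrow^{*} K\,\overline{\tau_j}\,\delta' \text{ and Lemma 6.16})
\end{align*}
```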

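If $\delta$ is a vector then let $\langle\!\langle\delta\rangle\!\rangle$ be the telescoped coercion with $\overleftarrow{\langle\!\langle\delta\rangle\!\rangle} = \delta = \overrightarrow{\langle\!\langle\delta\rangle\!\rangle}$ and the coercion proofs given by reflexivity. Note that $\Gamma \vdash \delta : \Delta$ is equivalent to $\Gamma \vdash_{tc} \langle\!\langle\delta\rangle\!\rangle : \Delta$.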

Finally, I can prove the key result, that well-typed coercions are compatible. 
This is a massive mutual structural induction on typing derivations, using the 
preceding results. Unlike most of the previous results, however, k is quantified 
inside the inductive hypothesis, because some cases need to increase it when 
making appeals to induction. 
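Lemma 6.19 (Basic Lemma).

(a) If $\Gamma \vdash \tau :^{\forall} \kappa$ then for all $k$, $A_k(\omega_0 : \Gamma)$ implies $A_k([\overleftarrow{\omega_0}/\Gamma]\,\tau \sim [\overrightarrow{\omega_0}/\Gamma]\,\tau)$.

(b) If $\Gamma \vdash br :^{\forall} \upsilon \blacktriangleright \tau$ or $\Gamma \vdash br :^{\forall} (e : \upsilon) \blacktriangleright \tau$ then for all $k$, $A_k(\omega_0 : \Gamma)$ implies $A_k([\overleftarrow{\omega_0}/\Gamma]\,br \approx [\overrightarrow{\omega_0}/\Gamma]\,br)$.

(c) If $\Gamma \vdash \gamma :^{\Box} \varphi$ then for all $k$, $A_k(\omega_0 : \Gamma)$ implies $A_k([\overleftarrow{\omega_0}/\Gamma]\,\varphi)$ and $A_k([\overrightarrow{\omega_0}/\Gamma]\,\varphi)$.

(d) If $\Gamma \vdash_{tc} \omega : \Delta$ then for all $k$, $A_k(\omega_0 : \Gamma)$ implies $A_k([\omega_0/\Gamma]\,\omega : \Delta)$.

Proof of part (a). Fix $k$ and $\omega_0$ such that $A_k(\omega_0 : \Gamma)$. Proceed by induction on the derivation of $\Gamma \vdash \tau :^{\forall} \kappa$ to show $A_k([\overleftarrow{\omega_0}/\Gamma]\,\tau \sim [\overrightarrow{\omega_0}/\Gamma]\,\tau)$.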



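Case
$$\frac{\Gamma \vdash \mathsf{ctx} \qquad \Gamma \ni a : \kappa}{\Gamma \vdash a : \kappa}$$
Here $A_k(\omega_0 : \Gamma)$ gives $A_k([\overleftarrow{\omega_0}/\Gamma]\,a \sim [\overrightarrow{\omega_0}/\Gamma]\,a)$.

Cases
$$\frac{\Gamma \vdash \mathsf{ctx} \qquad \Sigma \ni D :^{\forall} \kappa}{\Gamma \vdash D :^{\forall} \kappa} \qquad\text{and}\qquad \frac{\Gamma \vdash \mathsf{ctx}}{\Gamma \vdash K :^{\Phi} \kappa}$$
Trivial.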



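Case
$$\frac{\Sigma \ni f\,[\Delta] :^{\Phi} \kappa \qquad \Gamma \vdash \delta : \Delta \mathbin{/\!/} \Phi \qquad \Phi \neq \Box}{\Gamma \vdash f(\delta) :^{\Phi} [\delta/\Delta]\,\kappa}$$
Let $\omega = [\omega_0/\Gamma]\,\langle\!\langle\delta\rangle\!\rangle$ so that $\overleftarrow{\omega} = [\overleftarrow{\omega_0}/\Gamma]\,\delta$ and $\overrightarrow{\omega} = [\overrightarrow{\omega_0}/\Gamma]\,\delta$. Then the goal $A_k([\overleftarrow{\omega_0}/\Gamma]\,f(\delta) \sim [\overrightarrow{\omega_0}/\Gamma]\,f(\delta))$ is $A_k(f(\overleftarrow{\omega}) \sim f(\overrightarrow{\omega}))$. Now $f(\overleftarrow{\omega}) \longrightarrow [\overleftarrow{\omega}/\Delta]\,\tau$ and $f(\overrightarrow{\omega}) \longrightarrow [\overrightarrow{\omega}/\Delta]\,\tau$ where $\Sigma \ni f\,[\Delta] = \tau :^{\Phi} \kappa$. Moreover, induction using part (d) gives $A_{k-1}(\omega : \Delta)$, and $A_{k-1}([\overleftarrow{\omega}/\Delta]\,\tau \sim [\overrightarrow{\omega}/\Delta]\,\tau)$ follows since the function definition is good. Hence $A_k(f(\overleftarrow{\omega}) \sim f(\overrightarrow{\omega}))$ as required.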



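Case (suppressing the phase annotations on the judgments, which are not legible in this copy)
$$\frac{\Gamma \vdash \rho : (a :^{\Phi} \kappa_1) \to \kappa_2 \qquad \Gamma \vdash \rho' : \kappa_1}{\Gamma \vdash \rho\,\rho' : [\rho'/a]\,\kappa_2}$$
By induction, $A_k([\overleftarrow{\omega_0}/\Gamma]\,\rho \sim [\overrightarrow{\omega_0}/\Gamma]\,\rho)$ and $A_k([\overleftarrow{\omega_0}/\Gamma]\,\rho' \sim [\overrightarrow{\omega_0}/\Gamma]\,\rho')$. Now $A_{k-1}([\overleftarrow{\omega_0}/\Gamma]\,((a :^{\Phi} \kappa_1) \to \kappa_2) \sim [\overrightarrow{\omega_0}/\Gamma]\,((a :^{\Phi} \kappa_1) \to \kappa_2))$ by Lemma 6.18, so the definition on quantifiers gives $A_{k-1}([\overleftarrow{\omega_0}/\Gamma]\,([\rho'/a]\,\kappa_2) \sim [\overrightarrow{\omega_0}/\Gamma]\,([\rho'/a]\,\kappa_2))$. Hence $A_k([\overleftarrow{\omega_0}/\Gamma]\,(\rho\,\rho') \sim [\overrightarrow{\omega_0}/\Gamma]\,(\rho\,\rho'))$ as required.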
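Case
$$\frac{\Gamma \vdash \kappa :^{\forall} \star \qquad \Gamma, a :^{\Phi} \kappa \vdash \tau :^{\forall} \star}{\Gamma \vdash (a :^{\Phi} \kappa) \to \tau :^{\forall} \star}$$
To show $A_k([\overleftarrow{\omega_0}/\Gamma]\,((a :^{\Phi} \kappa) \to \tau) \sim [\overrightarrow{\omega_0}/\Gamma]\,((a :^{\Phi} \kappa) \to \tau))$, first observe that $A_k([\overleftarrow{\omega_0}/\Gamma]\,\kappa \sim [\overrightarrow{\omega_0}/\Gamma]\,\kappa)$ follows by induction. Suppose $l < k$, $\upsilon$ and $\upsilon'$ with […], then $A_l([\upsilon/a]\,[\overleftarrow{\omega_0}/\Gamma]\,\tau \sim [\upsilon'/a]\,[\overrightarrow{\omega_0}/\Gamma]\,\tau)$ also follows by induction.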



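Case
$$\frac{\Gamma \vdash \rho :^{\Phi} \kappa \qquad \Gamma \vdash \gamma :^{\Box} \kappa \sim \kappa' \qquad \Phi \neq \Box}{\Gamma \vdash \rho \triangleright \gamma :^{\Phi} \kappa'}$$
Here $A_k([\overleftarrow{\omega_0}/\Gamma]\,\rho \sim [\overrightarrow{\omega_0}/\Gamma]\,\rho)$ by induction, and $A_{k-1}([\overleftarrow{\omega_0}/\Gamma]\,(\kappa \sim \kappa'))$ and $A_{k-1}([\overrightarrow{\omega_0}/\Gamma]\,(\kappa \sim \kappa'))$ by part (c). Hence $A_k([\overleftarrow{\omega_0}/\Gamma]\,(\rho \triangleright \gamma) \sim [\overrightarrow{\omega_0}/\Gamma]\,(\rho \triangleright \gamma))$.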



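Cases
$$\frac{\Gamma \vdash \mathsf{ctx}}{\Gamma \vdash \star :^{\forall} \star} \qquad\text{and}\qquad \frac{\Gamma \vdash \mathsf{ctx}}{\Gamma \vdash (\sim) :^{\forall} (a :^{\forall} \star) \to (b :^{\forall} \star) \to a \to b \to \star}$$
Trivial.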



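Case
$$\frac{\Gamma \vdash \rho :^{\Phi} \upsilon \qquad \Phi \neq \Box \qquad \Gamma \vdash br_0 :^{\Phi} \upsilon \blacktriangleright \tau \quad \ldots \quad \Gamma \vdash br_n :^{\Phi} \upsilon \blacktriangleright \tau}{\Gamma \vdash \mathsf{case}\ \rho\ \mathsf{of}\ br_0 \ldots br_n :^{\Phi} \tau}$$
By induction, using part (b), and congruence for case analysis (Lemma 6.17).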



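Case: the typing rule for $\mathsf{dcase}\ e\ \mathsf{of}\ br_0 \ldots br_n$, with scrutinee $e$ of type $\upsilon$, side condition $\Phi \neq \Box$, and branches $\Gamma \vdash br_0 :^{\Phi} (e : \upsilon) \blacktriangleright \tau$, …, $\Gamma \vdash br_n :^{\Phi} (e : \upsilon) \blacktriangleright \tau$. As previous case.

□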



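Proof of part (b). Fix $k$ and $\omega_0$ such that $A_k(\omega_0 : \Gamma)$. Proceed by induction on the derivation to show $A_k([\overleftarrow{\omega_0}/\Gamma]\,br \approx [\overrightarrow{\omega_0}/\Gamma]\,br)$.

Case
$$\frac{\Sigma \ni K : (\overline{a_i :^{\forall} \kappa_i}, \Delta) \to D\,\overline{a_i} \qquad \Gamma, [\overline{\upsilon_i/a_i}]\,\Delta \vdash \rho :^{\Phi} \tau \qquad \Gamma \vdash \tau :^{\forall} \star \qquad \Phi \neq \Box}{\Gamma \vdash K\,([\overline{\upsilon_i/a_i}]\,\Delta) \to \rho :^{\Phi} D\,\overline{\upsilon_i} \blacktriangleright \tau}$$
First let $\Delta' = [\overline{\upsilon_i/a_i}]\,\Delta$ and $\Delta'' = [\overleftarrow{\omega_0}/\Gamma]\,\Delta' \bowtie [\overrightarrow{\omega_0}/\Gamma]\,\Delta'$. The goal is $A_k((\Delta'') \to [\overleftarrow{\omega_0}/\Gamma]\,\rho \sim [\overrightarrow{\omega_0}/\Gamma]\,\rho)$. Equivalently, suppose $\omega$ is such that $A_k(\omega_0, \omega : \Gamma, \Delta')$, then it suffices to show $A_k([\overleftarrow{(\omega_0, \omega)}/\Gamma, \Delta']\,\rho \sim [\overrightarrow{(\omega_0, \omega)}/\Gamma, \Delta']\,\rho)$, which follows from part (a).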
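Case: the corresponding rule for a dcase branch, $\Gamma \vdash K\,\Delta' \to \rho :^{\Phi} (e : D\,\overline{\upsilon_i}) \blacktriangleright \tau$, where $\Sigma \ni K : (\overline{a_i :^{\forall} \kappa_i}, \Delta) \to D\,\overline{a_i}$, the telescope $\Delta'$ extends $[\overline{\upsilon_i/a_i}]\,\Delta$ with a coercion variable $c :^{\Box} e \sim (K\,\overline{\upsilon_i}\,\Delta)$, $\Gamma, \Delta' \vdash \rho :^{\Phi} \tau$ and $\Phi \neq \Box$. Similar.

□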



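Proof of part (c). Fix $k$ and $\omega_0$ such that $A_k(\omega_0 : \Gamma)$. Proceed by induction on the derivation of $\Gamma \vdash \gamma :^{\Box} \varphi$ to show $A_k([\overleftarrow{\omega_0}/\Gamma]\,\varphi)$. In each case, it is straightforward to further show $A_k([\overrightarrow{\omega_0}/\Gamma]\,\varphi)$.

Case
$$\frac{\Gamma \ni c :^{\Box} \varphi \qquad \Gamma \vdash \mathsf{ctx}}{\Gamma \vdash c :^{\Box} \varphi}$$
Here the definition of $A_k(\omega_0 : \Gamma)$ gives $A_k([\overleftarrow{\omega_0}/\Gamma]\,\varphi)$.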



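Case
$$\frac{\Gamma \vdash \gamma :^{\Box} (a :^{\Phi} \kappa) \to \varphi \qquad \Gamma \vdash \tau : \kappa \qquad \Phi \neq \Box}{\Gamma \vdash \gamma\;{}^{\Phi}\,\tau :^{\Box} [\tau/a]\,\varphi}$$
Here $A_k([\overleftarrow{\omega_0}/\Gamma]\,((a :^{\Phi} \kappa) \to \varphi))$ by induction, and part (a) gives $A_k([\overleftarrow{\omega_0}/\Gamma]\,\tau \sim [\overrightarrow{\omega_0}/\Gamma]\,\tau)$, so $A_k([\overleftarrow{\omega_0}/\Gamma]\,[\tau/a]\,\varphi)$ follows immediately from the definition.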



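Case
$$\frac{\Gamma \vdash \gamma :^{\Box} (c :^{\Box} \varphi') \to \varphi \qquad \Gamma \vdash \eta :^{\Box} \varphi'}{\Gamma \vdash \gamma\;{}^{\Box}\,\eta :^{\Box} [\eta/c]\,\varphi}$$
Induction gives $A_k([\overleftarrow{\omega_0}/\Gamma]\,((c :^{\Box} \varphi') \to \varphi))$ from the first hypothesis and $A_k([\overleftarrow{\omega_0}/\Gamma]\,\varphi')$ from the second, so $A_k([\overleftarrow{\omega_0}/\Gamma]\,[\eta/c]\,\varphi)$ follows from the definition.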



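Case
$$\frac{\Gamma, a :^{\Phi} \kappa \vdash \gamma :^{\Box} \varphi}{\Gamma \vdash \lambda a :^{\Phi} \kappa.\,\gamma :^{\Box} (a :^{\Phi} \kappa) \to \varphi}$$
First suppose $\Phi \neq \Box$, and let $\tau$ be such that $\cdot \vdash \tau :^{\forall} \kappa$ and $A_k(\tau \sim \tau)$. Induction gives $A_k([\overleftarrow{(\omega_0, \tau)}/\Gamma, a :^{\Phi} \kappa]\,\varphi)$ so $A_k([\overleftarrow{\omega_0}/\Gamma]\,((a :^{\Phi} \kappa) \to \varphi))$ as required. The case $\Phi = \Box$ is similar.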



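Case
$$\frac{\Gamma \vdash \mathsf{ctx} \qquad \Sigma \ni C :^{\Box} \varphi}{\Gamma \vdash C :^{\Box} \varphi}$$
Here the goodness of $\Sigma$ gives $A_k(\varphi)$ and hence $A_k([\overleftarrow{\omega_0}/\Gamma]\,\varphi)$ since $\varphi$ is closed.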
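Case
$$\frac{\Gamma \vdash_{tc} \omega : \Delta \qquad \Gamma, \Delta \vdash \tau : \kappa}{\Gamma \vdash \mathsf{resp}\ \omega\ \Delta\ \tau :^{\Box} [\overleftarrow{\omega}/\Delta]\,\tau \sim [\overrightarrow{\omega}/\Delta]\,\tau}$$
Part (d) gives $A_k([\omega_0/\Gamma]\,\omega : \Delta)$, then part (a) gives $A_k([\overleftarrow{\omega_0}/\Gamma]\,([\overleftarrow{\omega}/\Delta]\,\tau \sim [\overrightarrow{\omega}/\Delta]\,\tau))$ as required.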



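Case
$$\frac{\Gamma \vdash \gamma :^{\Box} \tau\,\tau' \sim \upsilon\,\upsilon'}{\Gamma \vdash \mathsf{left}\ \gamma :^{\Box} \tau \sim \upsilon}$$
By induction, $A_k([\overleftarrow{\omega_0}/\Gamma]\,(\tau\,\tau' \sim \upsilon\,\upsilon'))$, and hence $A_k([\overleftarrow{\omega_0}/\Gamma]\,(\tau \sim \upsilon))$ by definition.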



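Case
$$\frac{\Gamma \vdash \gamma :^{\Box} \tau\,\tau' \sim \upsilon\,\upsilon'}{\Gamma \vdash \mathsf{right}\ \gamma :^{\Box} \tau' \sim \upsilon'}$$
Similar to the previous case.

Case
$$\frac{\Gamma \vdash \gamma :^{\Box} ((a_1 :^{\Phi} \kappa_1) \to \tau_1) \sim ((a_2 :^{\Phi} \kappa_2) \to \tau_2)}{\Gamma \vdash \mathsf{left}\ \gamma :^{\Box} \kappa_1 \sim \kappa_2}$$
By induction, $A_k([\overleftarrow{\omega_0}/\Gamma]\,(((a_1 :^{\Phi} \kappa_1) \to \tau_1) \sim ((a_2 :^{\Phi} \kappa_2) \to \tau_2)))$, and hence $A_k([\overleftarrow{\omega_0}/\Gamma]\,(\kappa_1 \sim \kappa_2))$ by definition.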



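Case
$$\frac{\Gamma \vdash \gamma :^{\Box} (\kappa_1 \to \tau_1) \sim (\kappa_2 \to \tau_2)}{\Gamma \vdash \mathsf{right}\ \gamma :^{\Box} \tau_1 \sim \tau_2}$$
Here the inductive hypothesis gives $A_k([\overleftarrow{\omega_0}/\Gamma]\,((\kappa_1 \to \tau_1) \sim (\kappa_2 \to \tau_2)))$, so $A_k([\overleftarrow{\omega_0}/\Gamma]\,(\tau_1 \sim \tau_2))$ by definition.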



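Case
$$\frac{\Gamma \vdash \gamma :^{\Box} (\tau_1 : (a_1 :^{T} \kappa_1) \to \kappa_1') \sim (\tau_2 : (a_2 :^{T} \kappa_2) \to \kappa_2') \qquad \Gamma \vdash \eta :^{\Box} (\upsilon_1 : \kappa_1) \sim (\upsilon_2 : \kappa_2)}{\Gamma \vdash \mathsf{conga}_{T}\ \gamma\ \eta :^{\Box} (\tau_1\,\upsilon_1) \sim (\tau_2\,\upsilon_2)}$$
By induction, $A_{k+1}([\overleftarrow{\omega_0}/\Gamma]\,(\tau_1 \sim \tau_2))$ and $A_k([\overleftarrow{\omega_0}/\Gamma]\,(\upsilon_1 \sim \upsilon_2))$. Moreover Lemma 6.18 gives $A_k([\overleftarrow{\omega_0}/\Gamma]\,(((a_1 :^{T} \kappa_1) \to \kappa_1') \sim ((a_2 :^{T} \kappa_2) \to \kappa_2')))$ and hence $A_{k-1}([\overleftarrow{\omega_0}/\Gamma]\,([\upsilon_1/a_1]\,\kappa_1' \sim [\upsilon_2/a_2]\,\kappa_2'))$. Thus $A_k([\overleftarrow{\omega_0}/\Gamma]\,(\tau_1\,\upsilon_1 \sim \tau_2\,\upsilon_2))$ as required. Note that this case relies on the fact that $k$ is universally quantified inside the inductive hypothesis.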



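Case
$$\frac{\Gamma \vdash \gamma :^{\Box} (\tau_1 : (c_1 :^{\Box} \varphi_1) \to \kappa_1) \sim (\tau_2 : (c_2 :^{\Box} \varphi_2) \to \kappa_2) \qquad \Gamma \vdash \eta_1 :^{\Box} \varphi_1 \qquad \Gamma \vdash \eta_2 :^{\Box} \varphi_2}{\Gamma \vdash \mathsf{conga}_{\Box}\ \gamma\ (\eta_1, \eta_2) :^{\Box} (\tau_1\,\eta_1) \sim (\tau_2\,\eta_2)}$$
Similar to previous case.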
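Case
$$\frac{\begin{array}{c}\Gamma, a_1 :^{T} \kappa_1 \vdash \tau_1 :^{\forall} \star \qquad \Gamma, a_2 :^{T} \kappa_2 \vdash \tau_2 :^{\forall} \star \qquad \Gamma \vdash \eta :^{\Box} \kappa_1 \sim \kappa_2 \\ \Gamma \vdash \gamma :^{\Box} (a_1 :^{T} \kappa_1, a_2 :^{T} \kappa_2, c :^{\Box} a_1 \sim a_2) \to \tau_1 \sim \tau_2\end{array}}{\Gamma \vdash \mathsf{cong}_{T}\ \eta\ \gamma :^{\Box} ((a_1 :^{T} \kappa_1) \to \tau_1) \sim ((a_2 :^{T} \kappa_2) \to \tau_2)}$$
$A_k([\overleftarrow{\omega_0}/\Gamma]\,(\kappa_1 \sim \kappa_2))$ and $A_k([\overleftarrow{\omega_0}/\Gamma]\,((a_1 :^{T} \kappa_1, a_2 :^{T} \kappa_2, c :^{\Box} a_1 \sim a_2) \to \tau_1 \sim \tau_2))$ by induction. Hence $A_k([\overleftarrow{\omega_0}/\Gamma]\,(((a_1 :^{T} \kappa_1) \to \tau_1) \sim ((a_2 :^{T} \kappa_2) \to \tau_2)))$.

Case
$$\frac{\begin{array}{c}\Gamma, c_1 :^{\Box} \varphi_1 \vdash \tau_1 :^{\forall} \star \qquad \Gamma, c_2 :^{\Box} \varphi_2 \vdash \tau_2 :^{\forall} \star \qquad \Gamma \vdash \eta :^{\Box} \varphi_1 \sim \varphi_2 \\ \Gamma \vdash \gamma :^{\Box} (c_1 :^{\Box} \varphi_1, c_2 :^{\Box} \varphi_2) \to \tau_1 \sim \tau_2\end{array}}{\Gamma \vdash \mathsf{cong}_{\Box}\ \eta\ \gamma :^{\Box} ((c_1 :^{\Box} \varphi_1) \to \tau_1) \sim ((c_2 :^{\Box} \varphi_2) \to \tau_2)}$$
Similar to previous case.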



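Case
$$\frac{\Gamma \vdash \gamma :^{\Box} e \sim e' \qquad \Gamma \vdash \eta_0 :^{\Box} br_0 \approx br_0' \quad \ldots \quad \Gamma \vdash \eta_n :^{\Box} br_n \approx br_n'}{\Gamma \vdash (\mathsf{cong}\ (\mathsf{d})\mathsf{case}\ \gamma\ \overline{\eta_i}) :^{\Box} ((\mathsf{d})\mathsf{case}\ e\ \mathsf{of}\ \overline{br_i}) \sim ((\mathsf{d})\mathsf{case}\ e'\ \mathsf{of}\ \overline{br_i'})}$$
By induction and Lemma 6.17.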



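Case
$$\frac{\Gamma \vdash \gamma :^{\Box} \varphi \qquad \Gamma \vdash \eta :^{\Box} \varphi \sim \varphi'}{\Gamma \vdash \gamma \triangleright \eta :^{\Box} \varphi'}$$
By induction, $A_k([\overleftarrow{\omega_0}/\Gamma]\,\varphi)$ and $A_k([\overleftarrow{\omega_0}/\Gamma]\,(\varphi \sim \varphi'))$. Then Lemma D.12 gives the required result.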



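Case
$$\frac{\Gamma \vdash \gamma :^{\Box} ((a_1 :^{T} \kappa_1) \to \tau_1) \sim ((a_2 :^{T} \kappa_2) \to \tau_2) \qquad \Gamma \vdash \eta :^{\Box} (\upsilon_1 : \kappa_1) \sim (\upsilon_2 : \kappa_2)}{\Gamma \vdash \gamma\,@\,\eta :^{\Box} [\upsilon_1/a_1]\,\tau_1 \sim [\upsilon_2/a_2]\,\tau_2}$$
Here induction gives $A_{k+1}([\overleftarrow{\omega_0}/\Gamma]\,(((a_1 :^{T} \kappa_1) \to \tau_1) \sim ((a_2 :^{T} \kappa_2) \to \tau_2)))$ and $A_k([\overleftarrow{\omega_0}/\Gamma]\,(\upsilon_1 \sim \upsilon_2))$, so by definition, $A_k([\overleftarrow{\omega_0}/\Gamma]\,([\upsilon_1/a_1]\,\tau_1 \sim [\upsilon_2/a_2]\,\tau_2))$.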



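Case
$$\frac{\Gamma \vdash \gamma :^{\Box} ((c_1 :^{\Box} \varphi_1) \to \tau_1) \sim ((c_2 :^{\Box} \varphi_2) \to \tau_2) \qquad \Gamma \vdash \eta_1 :^{\Box} \varphi_1 \qquad \Gamma \vdash \eta_2 :^{\Box} \varphi_2}{\Gamma \vdash \gamma\,@\,(\eta_1, \eta_2) :^{\Box} [\eta_1/c_1]\,\tau_1 \sim [\eta_2/c_2]\,\tau_2}$$
Similar to previous case.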



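Case
$$\frac{\Gamma \vdash \gamma :^{\Box} (\tau_1 : \kappa_1) \sim (\tau_2 : \kappa_2) \qquad \Gamma \vdash \eta :^{\Box} \ldots}{\Gamma \vdash \mathsf{coh}\ \gamma\ \eta :^{\Box} \tau_1 \triangleright \eta \sim \tau_2}$$
By induction, $A_k([\overleftarrow{\omega_0}/\Gamma]\,(\tau_1 \sim \tau_2))$ and $A_{k-1}([\overleftarrow{\omega_0}/\Gamma]\,(\kappa_1 \sim \kappa_2))$. Hence $A_k([\overleftarrow{\omega_0}/\Gamma]\,(\tau_1 \triangleright \eta \sim \tau_2))$ as required.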
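Case
$$\frac{\Gamma \vdash \tau :^{\forall} \kappa \qquad \Gamma \vdash \tau' :^{\forall} \kappa \qquad \tau \xrightarrow{kpush} \tau'}{\Gamma \vdash \mathsf{step}\ \tau :^{\Box} \tau \sim \tau'}$$
Here induction using part (a) gives $A_{k+1}([\overleftarrow{\omega_0}/\Gamma]\,(\tau \sim \tau))$, and Lemma D.9 gives $[\overleftarrow{\omega_0}/\Gamma]\,\tau \longrightarrow [\overleftarrow{\omega_0}/\Gamma]\,\tau'$, so Lemma 6.16 gives $A_k([\overleftarrow{\omega_0}/\Gamma]\,(\tau \sim \tau'))$.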



Case
$$\frac{\Gamma \vdash \gamma :^{\Box} (\tau_1 : \kappa_1) \sim (\tau_2 : \kappa_2)}{\Gamma \vdash \mathsf{kind}\ \gamma :^{\Box} \kappa_1 \sim \kappa_2}$$
By induction and Lemma 6.18.

□



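Proof of part (d). Fix $k$ and $\omega_0$ such that $A_k(\omega_0 : \Gamma)$. Proceed by induction on the derivation of $\Gamma \vdash_{tc} \omega : \Delta$ to show $A_k([\omega_0/\Gamma]\,\omega : \Delta)$.

Case
$$\frac{\Gamma \vdash \mathsf{ctx}}{\Gamma \vdash_{tc} \cdot : \cdot}$$
Trivial.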



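Case
$$\frac{\Gamma \vdash_{tc} \omega : \Delta \qquad \Gamma \vdash \gamma :^{\Box} \tau \sim \upsilon \qquad \Gamma \vdash \tau :^{T} [\overleftarrow{\omega}/\Delta]\,\kappa \qquad \Gamma \vdash \upsilon :^{T} [\overrightarrow{\omega}/\Delta]\,\kappa}{\Gamma \vdash_{tc} (\omega, (\tau, \upsilon, \gamma)) : (\Delta, a :^{T} \kappa)}$$
Here $A_k([\omega_0/\Gamma]\,\omega : \Delta)$ by induction, and $A_k([\overleftarrow{\omega_0}/\Gamma]\,\tau \sim [\overleftarrow{\omega_0}/\Gamma]\,\upsilon)$ and $A_k([\overrightarrow{\omega_0}/\Gamma]\,\tau \sim [\overrightarrow{\omega_0}/\Gamma]\,\upsilon)$ from part (c). Moreover $A_k([\overleftarrow{\omega_0}/\Gamma]\,\tau \sim [\overrightarrow{\omega_0}/\Gamma]\,\tau)$ from part (a). Hence symmetry and transitivity give $A_k([\overleftarrow{\omega_0}/\Gamma]\,\tau \sim [\overrightarrow{\omega_0}/\Gamma]\,\upsilon)$ as required.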
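One way to assemble the final step of this case is sketched below. It assumes that the two decorations on $\omega_0$ denote its left and right projections, written here as $\overleftarrow{\omega_0}$ and $\overrightarrow{\omega_0}$, and that part (c) supplies compatibility of the equation $\tau \sim \upsilon$ under each projection. This particular route needs only transitivity; the source also mentions symmetry.

```latex
\begin{align*}
A_k([\overleftarrow{\omega_0}/\Gamma]\,\tau \sim [\overrightarrow{\omega_0}/\Gamma]\,\tau)
  &\quad \text{part (a)} \\
A_k([\overrightarrow{\omega_0}/\Gamma]\,\tau \sim [\overrightarrow{\omega_0}/\Gamma]\,\upsilon)
  &\quad \text{part (c), right projection} \\
A_k([\overleftarrow{\omega_0}/\Gamma]\,\tau \sim [\overrightarrow{\omega_0}/\Gamma]\,\upsilon)
  &\quad \text{transitivity of the two lines above}
\end{align*}
```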



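Case
$$\frac{\Gamma \vdash_{tc} \omega : \Delta \qquad \Gamma \vdash \eta :^{\Box} [\overleftarrow{\omega}/\Delta]\,\varphi \qquad \Gamma \vdash \eta' :^{\Box} [\overrightarrow{\omega}/\Delta]\,\varphi}{\Gamma \vdash_{tc} (\omega, (\eta, \eta')) : (\Delta, c :^{\Box} \varphi)}$$
Let $\omega' = \omega_0, [\omega_0/\Gamma]\,\omega$. By induction, $A_k([\omega_0/\Gamma]\,\omega : \Delta)$, $A_{k-1}([\overleftarrow{\omega'}/\Gamma, \Delta]\,\varphi)$ and $A_{k-1}([\overrightarrow{\omega'}/\Gamma, \Delta]\,\varphi)$ as required.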



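Case
$$\frac{\Gamma \vdash_{tc} \omega : \Delta \qquad \Gamma \vdash e :^{\lambda} [\overleftarrow{\omega}/\Delta]\,\tau \qquad \Gamma \vdash e' :^{\lambda} [\overrightarrow{\omega}/\Delta]\,\tau}{\Gamma \vdash_{tc} (\omega, (e, e')) : (\Delta, x :^{\lambda} \tau)}$$
By induction, $A_k([\omega_0/\Gamma]\,\omega : \Delta)$.

□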